Responsible AI in practice

Make every AI claim quote a real source, or fail

How we force every AI quotation to be a literal substring of the source, and block a denied vocabulary in code, in our insolvency build.

Hamada Mahdi · 8 min read · Updated 29 May 2026 · Researched and drafted with AI assistance, reviewed by Dr Karl George MBE

A language model read a set of filed accounts and reported that a six-figure sum was "owed to the company by the directors and shareholders." That sentence was word for word what the filing said, but we did not take the model's word for it. Before the line was allowed into the internal report, our code searched the source document for the exact string the model had returned. The check passed because the text was really there. Had the model paraphrased, rounded the figure, or invented a sentence that read plausibly but appeared nowhere in the filing, the extraction would have thrown an error and the claim would never have reached a human.

That is the control this post is about: an AI anti-hallucination check that treats a quotation as valid only when it is a literal substring of the source it claims to be quoting. It is one of the controls in our insolvency intelligence build, alongside a denied-vocabulary blocklist enforced in code. Both are deliberately unglamorous. Both are the difference between an AI output you can put in front of a regulated firm and one you cannot.

Key takeaways

  • A quotation passes only when it appears character for character in the source; a paraphrase, rounded figure or invented line fails the check and is never published.
  • The substring test is deterministic and runs after generation, so it does not ask the model to police itself or rely on a confidence score.
  • A denied-vocabulary blocklist is enforced twice in code (injected into the prompt and re-scanned on output), and a draft containing a banned term is rejected, not silently cleaned.
  • Nothing reaches an external party without a recorded approval naming who approved it and when, the point where responsibility transfers from system to person.
  • These controls are the operational evidence the UK's pro-innovation principles and ISO/IEC 42001 expect: testable, documented and tied to a named owner.

A quote should be a quote, not a paraphrase the model is confident about

Most discussion of "AI hallucination" stops at the model. People reach for better prompts, retrieval augmentation, or a second model grading the first. Those help. None of them give you a guarantee, because they all ask a probabilistic system to police itself.

The substring check sidesteps the model entirely. It is a deterministic test that runs after generation:

  1. The model is asked to extract claims from a source document, and for each claim to return the exact supporting passage.
  2. For every returned passage, code checks whether that passage appears, character for character, in the source text.
  3. If it does not appear, the extraction fails. The job stops. Nothing is published.

This is fail-on-mismatch, not warn-on-mismatch. There is no confidence score to tune and no threshold to argue about. Either the words are in the document or they are not. A model that confabulates a quotation, or subtly reshapes a real one, cannot pass a test that asks only whether the bytes match.
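
The three steps can be sketched in a few lines of Python. This is a minimal illustration, not the code from the build: the names `verify_quotes` and `QuoteNotFoundError`, the shape of the claim dictionaries, and the sample source text are all assumptions made for the example.

```python
class QuoteNotFoundError(ValueError):
    """Raised when a returned passage is not a literal substring of the source."""


def verify_quotes(source_text: str, claims: list[dict]) -> list[dict]:
    """Return the claims untouched if every quote is in the source, else raise."""
    for claim in claims:
        quote = claim["quote"]
        # The empty string is a substring of everything, so reject it explicitly.
        if not quote or quote not in source_text:
            # Fail on mismatch: the whole extraction stops, nothing is published.
            raise QuoteNotFoundError(f"quote not found in source: {quote!r}")
    return claims


# Made-up sample text standing in for a filing; not taken from any real document.
source = "A sum of £312,480 is owed to the company by the directors and shareholders."
good_claims = [{"claim": "Directors owe the company money",
                "quote": "owed to the company by the directors and shareholders"}]
bad_claims = [{"claim": "Directors owe the company money",
               "quote": "approximately £300,000 owed by a director"}]

verify_quotes(source, good_claims)  # passes silently
try:
    verify_quotes(source, bad_claims)
except QuoteNotFoundError as err:
    print(f"extraction stopped: {err}")
```

Raising an exception, rather than returning a flag, is what makes this fail-on-mismatch: a caller cannot forget to check the result and publish anyway.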

We use it where the stakes are highest: when an AI reads a company's filed accounts and surfaces evidence of exposure on an overdrawn director's loan account. In that build, the model runs at temperature zero, and every figure it reports must be anchored to a verbatim line in the filing. The example that opens this post is a demo extraction run against a real public filing from Companies House; it is illustrative of how the control behaves, not a commercial result, and the company is not named.

Why "literal substring" and not "semantically similar"

Semantic similarity is the wrong test for a quotation, because the failure you are guarding against is precisely the case where the meaning drifts a little. "£312,480 owed by the directors" and "approximately £300,000 owed by a director" are semantically close. One is a quote; the other is a fabrication that would embarrass you in front of an insolvency practitioner.

A substring check has no tolerance for that drift by design. It is brittle in exactly the way you want: it breaks loudly the moment the model stops quoting and starts composing.

The cost is real and worth naming. The check also rejects legitimate text that has merely been reformatted: a currency symbol normalised differently, or whitespace that does not match. The discipline therefore sits upstream. You normalise the source and the candidate quote the same way before comparing, and you keep the comparison narrow. You are trading a little engineering friction for a property you can state plainly to a board: no claim reaches a human unless its supporting words exist in the source.
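
One way to keep that normalisation narrow is sketched below. The choice of Unicode NFKC plus whitespace collapsing is an assumption for illustration; the post does not say which normalisation the build applies, and the function names are hypothetical.

```python
import unicodedata


def normalise(text: str) -> str:
    """Apply the same narrow normalisation to source and quote before comparing."""
    # NFKC folds compatibility forms, e.g. a non-breaking space becomes a plain
    # space and a full-width pound sign becomes the ordinary one. It also folds
    # ligatures and superscripts, which may be broader than a given build wants.
    text = unicodedata.normalize("NFKC", text)
    # Collapse every run of whitespace, including line breaks, to a single space.
    return " ".join(text.split())


def quote_in_source(quote: str, source: str) -> bool:
    """Literal substring test, applied after identical normalisation of both sides."""
    normalised_quote = normalise(quote)
    return bool(normalised_quote) and normalised_quote in normalise(source)


# Made-up sample: the source has a non-breaking space and a line break.
sample_source = "owed to the\u00a0company\n  by the directors"
print(quote_in_source("owed to the company by the directors", sample_source))
print(quote_in_source("owed by a director", sample_source))
```

The normalisation changes how characters are encoded and spaced, never which words or digits are present, so a rounded figure or a reworded phrase still fails.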

A denied vocabulary belongs in code, not in a prompt

The second control answers a different risk. In insolvency and corporate-debt work, language is a legal exposure. A drafted contact letter that implies misconduct, alleges fraud, or uses demanding phrasing such as "you owe" or "we demand" is not a tone problem. It is a defamation and conduct problem.

The obvious move is to tell the model not to use those words. We do that too. But a system prompt is a request, not a guarantee, and the words you most want to exclude are exactly the ones a model trained on adversarial debt-recovery text will reach for. So the denied vocabulary is enforced twice, in code:

  • On the way in. The blocklist is injected into every prompt, so the model is told the constraint explicitly.
  • On the way out. Every generated draft is re-scanned against the same list. A draft containing a denied term is rejected, not quietly cleaned up and shipped.

Rejection rather than redaction matters. If you silently strip a banned word, you hide the fact that the model tried to use it, and you lose the signal that a prompt or a model needs attention. A rejected draft is an event you can log, count and review. The list lives in version control, so a change to what counts as unacceptable language is a reviewed code change with an author and a date, not an edit someone made to a prompt one afternoon.
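
A minimal sketch of the two enforcement points follows. The list contents are illustrative ("you owe" and "we demand" come from the text above; "fraud" is added as an assumption), and every function name is hypothetical rather than taken from the build.

```python
import re

# Illustrative list only. In practice it would live in version control.
DENIED_TERMS = ("you owe", "we demand", "fraud")


class DeniedTermError(ValueError):
    """Raised when a generated draft contains a denied term."""


def build_prompt(task: str, denied_terms=DENIED_TERMS) -> str:
    """On the way in: state the constraint explicitly in the prompt."""
    listed = ", ".join(f'"{term}"' for term in denied_terms)
    return f"{task}\n\nNever use any of these words or phrases: {listed}."


def find_denied_terms(draft: str, denied_terms=DENIED_TERMS) -> list[str]:
    """On the way out: return every denied term found, ignoring case."""
    hits = []
    for term in denied_terms:
        # Whole-word match, so variants such as "fraudulent" need their own entry.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, draft, flags=re.IGNORECASE):
            hits.append(term)
    return hits


def gate_draft(draft: str) -> str:
    """Reject, never redact: raise so the attempt can be logged and counted."""
    hits = find_denied_terms(draft)
    if hits:
        raise DeniedTermError(f"draft rejected, denied terms present: {hits}")
    return draft


try:
    gate_draft("You owe the company a substantial sum.")
except DeniedTermError as err:
    print(err)
```

Because `gate_draft` returns the draft unchanged or raises, there is no code path in which a banned term is stripped and the cleaned text shipped.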

Approach                | What it gives you                     | What it cannot promise
Prompt instruction only | A model that usually complies         | Compliance under every input
Output re-scan in code  | A hard gate on forbidden output       | That the model never tries
Both, with rejection    | Defence in depth plus a review signal | Zero engineering cost

These two controls compose. The substring check governs whether a claim is true to its source; the blocklist governs whether the language built around that claim is safe to send. Neither replaces the other.

The model never decides; a named person does

Code-enforced checks set a floor. They do not lift the obligation to keep a human accountable for what goes out. In this build, nothing reaches an external party without a recorded approval: a draft contact letter sits in a workflow until a named person approves it, and the system records who approved it and when. That recorded approval is the point at which responsibility transfers from the system to a person. It is the same pattern we use across our work, from a decision ledger that records accept, modify or reject against a named individual to cost-approval gates in operational systems.
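
The approval gate can be sketched as a record plus a release function that refuses to proceed without one. This is an illustration under assumed names (`Approval`, `approve`, `release`) and an in-memory dictionary standing in for whatever store a real workflow would use.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Approval:
    """Who approved which draft, and when. Frozen so a record cannot be edited."""
    draft_id: str
    approved_by: str
    approved_at: datetime


class NotApprovedError(RuntimeError):
    """Raised when a draft is released without a recorded approval."""


def approve(draft_id: str, approved_by: str) -> Approval:
    """Create an approval record; an approval must name a person."""
    if not approved_by.strip():
        raise ValueError("an approval must name a person")
    return Approval(draft_id, approved_by.strip(), datetime.now(timezone.utc))


def release(draft_id: str, approvals: dict[str, Approval]) -> Approval:
    """Return the approval that authorises release, or refuse to proceed."""
    approval = approvals.get(draft_id)
    if approval is None:
        raise NotApprovedError(f"draft {draft_id} has no recorded approval")
    return approval


approvals: dict[str, Approval] = {}
record = approve("letter-001", "A. Reviewer")  # hypothetical draft id and name
approvals[record.draft_id] = record
print(release("letter-001", approvals))
```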

The architecture is deliberately conservative around it. The pipeline pulls only from official public sources, scores each opportunity through a transparent eight-factor weighted model whose weights must sum to exactly 1.0 (the code refuses to run otherwise), and logs every model call with its prompt version and raw response so a result can be reproduced. The blocklist and substring check sit inside that frame. The product is early-stage, with a working demo rather than a long production record, and we describe it that way; what is settled is the design.
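
The refuse-to-run rule on the weights might look like the sketch below. The factor names and weight values are placeholders, since the post gives only the count and the sum. Using `Decimal` is also an assumption: binary floats cannot represent most decimal fractions exactly, so an "exactly 1.0" test on floats can fail for weights that look correct on paper.

```python
from decimal import Decimal

# Placeholder factors and weights, invented for the example.
WEIGHTS = {
    "factor_1": Decimal("0.20"),
    "factor_2": Decimal("0.15"),
    "factor_3": Decimal("0.15"),
    "factor_4": Decimal("0.10"),
    "factor_5": Decimal("0.10"),
    "factor_6": Decimal("0.10"),
    "factor_7": Decimal("0.10"),
    "factor_8": Decimal("0.10"),
}


def validate_weights(weights: dict[str, Decimal]) -> None:
    """Refuse to continue unless there are eight weights summing to exactly 1."""
    if len(weights) != 8:
        raise ValueError(f"expected 8 factors, got {len(weights)}")
    total = sum(weights.values(), Decimal("0"))
    if total != Decimal("1"):
        raise ValueError(f"weights sum to {total}, expected exactly 1")


def score(factors: dict[str, Decimal], weights: dict[str, Decimal] = WEIGHTS) -> Decimal:
    """Weighted sum of factor values; validation runs before any scoring."""
    validate_weights(weights)
    return sum((weights[name] * factors[name] for name in weights), Decimal("0"))


all_ones = {name: Decimal("1") for name in WEIGHTS}
print(score(all_ones))
```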

How this maps to what UK regulators already expect

None of this is compliance theatre, and it is worth being precise about why, because the UK has no single statute to point to. There is no enacted "UK AI Act"; domestic AI governance is principles-based and regulator-led, structured around the five pro-innovation principles set out by DSIT (as confirmed in the government response of 6 February 2024).

Two of those principles map directly onto the controls above:

  • Appropriate transparency and explainability. A claim you can trace to a verbatim line in a source document is the most concrete form of explainability there is. The substring check is what makes "every figure is sourced" an enforced property rather than an aspiration.
  • Accountability and governance. The recorded approval gate names the person responsible for each external communication.

Where personal data is involved, the binding regime that already applies in the UK today is the UK GDPR, enforced by the ICO. The Data (Use and Access) Act 2025 reshaped the rules on automated decision-making: its section 80 came into force on 5 February 2026, replacing the old Article 22 with new Articles 22A to 22D and preserving the right to human review of significant decisions. The ICO's updated guidance on automated decision-making and profiling was in draft as of late May 2026 (its consultation closed on 29 May 2026, with final guidance expected in summer 2026), so treat the detail as provisional. A pipeline where a person approves every external action, and where the model's claims are checked against source before a human ever sees them, is well placed for that direction of travel.

For an organisation building an AI management system aligned to ISO/IEC 42001 (the first international AI management system standard, published in December 2023), controls like these are the operational evidence the standard expects: documented, testable, and tied to a named owner. We help organisations design that backbone and prepare for certification by a UKAS-accredited body; we do not issue certification ourselves.

What to take to your own build

If you take one thing from the insolvency work, take the test, not the tooling. Three questions are worth putting to any team shipping AI that quotes or drafts on your behalf:

  1. When the system says it is quoting a source, does anything other than the model verify that the quote exists in the source?
  2. When language is a legal exposure, is the forbidden vocabulary enforced in code on the output, or only requested in a prompt?
  3. Can you name the person who approved each external action, and find the record?

If the honest answer to any of these is "the model handles that," you are trusting a probabilistic system with a guarantee it cannot give. Deterministic checks are cheap to build and easy to explain to a board, which is exactly why they belong at the boundary between what the model produces and what leaves the building. You can see the same pattern at work in our insolvency intelligence case study.

Last reviewed: 29 May 2026.

If you are weighing where AI sits in a regulated workflow and want a candid view of which controls actually hold, our services page sets out how we work, or you can start a conversation.
