What does responsible AI mean in practice?
It means the governance holds even when the model misbehaves. A responsible AI system constrains what the AI can do, not merely what it is asked to do. It keeps a named human as the decision-maker, and it records every decision in a form that cannot be quietly tidied afterwards.
That definition is deliberately unglamorous. It does not turn on the model being clever or the policy being well written. It turns on the system being unable to do the things the organisation has decided AI must not do, and on the record being honest. Each of the controls below comes from a production system we have built for a regulated UK organisation, not from a framework diagram, and each exists because a board, an auditor or a regulator asked a question that a policy document could not answer.
Why is a policy document not enough?
Because a policy describes intent and a regulator asks for evidence. Most AI governance is performance, not protection: it looks like oversight and does not survive a regulator's question. A control that exists only on paper depends entirely on people and models behaving exactly as the paper assumes.
The gap usually opens at the handoff. One firm writes the AI policy, a different team writes the code, and the controls the board needs never make it into the system. The policy says "human oversight"; the system ships with an approve button nobody audits. The policy says "no decisions about individuals"; the model quietly ranks them anyway. Writing the governance and the code in one pass is the only reliable way we know to close that gap, which is why we treat the two as a single job.
Which controls do the most work?
Four, in our builds: read-only database access the model cannot escape, a confidence floor with reason codes that overrules the model, an append-only ledger of named human decisions, and a substring check that fails any quotation not literally present in the source document.
Each is enforced below the model, so none depends on the model behaving. Read-only by construction means the database itself rejects any write the AI attempts, so the analytics layer cannot change what it reads, whatever the prompt says. A confidence floor with reason codes lets deterministic code overrule the model and route the case to a named person whenever certainty drops below a configured threshold. The append-only decision ledger records the model, its input, its output and the human's accept, modify or reject, with a name and a timestamp, in a table that permits no update or delete. And the substring check requires every AI quotation to be a literal substring of the source; if one is not, the extraction fails and is never published.
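Three of these controls are small enough to sketch in a few lines. The Python below is a minimal illustration under stated assumptions, not the production code: SQLite stands in for the database, and the names open_read_only, route and verify_quotations, the reason code strings and the 0.85 floor are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical threshold; a real floor would be configured per use case.
CONFIDENCE_FLOOR = 0.85


def open_read_only(path: str) -> sqlite3.Connection:
    """Open an existing SQLite file in read-only mode.

    Writes are rejected by the database layer, not by the prompt.
    """
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


@dataclass(frozen=True)
class Routing:
    action: str                    # "proceed" or "human_review"
    reason_codes: tuple[str, ...]  # why deterministic code overruled the model


def route(confidence: float, floor: float = CONFIDENCE_FLOOR) -> Routing:
    """Deterministic check that runs after the model and can overrule it."""
    if not 0.0 <= confidence <= 1.0:  # also catches NaN
        return Routing("human_review", ("CONFIDENCE_INVALID",))
    if confidence < floor:
        return Routing("human_review", ("CONFIDENCE_BELOW_FLOOR",))
    return Routing("proceed", ())


class QuotationNotInSource(ValueError):
    """Raised when a quotation is not a literal substring of the source."""


def verify_quotations(quotations: list[str], source: str) -> list[str]:
    """Return the quotations unchanged, or fail the whole extraction."""
    for quote in quotations:
        # An empty string is a substring of everything, so reject it too.
        if not quote or quote not in source:
            raise QuotationNotInSource(quote)
    return quotations
```

In this sketch, route(0.62) sends the case to human review with the reason code CONFIDENCE_BELOW_FLOOR, and a quotation that differs from the source by a single character raises QuotationNotInSource. In a server database, the same read-only effect would normally come from a role granted read privileges only.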
How does a control become evidence?
When the system produces the record itself. An append-only ledger of accept, modify or reject decisions, each with a named person and a timestamp, is an audit trail by construction. A disclosure generated from that record, rather than typed from memory at the end, cannot drift from what actually happened.
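One way to make a ledger append-only by construction is to have the database refuse updates and deletes outright. The sketch below assumes SQLite and uses triggers to do that; the table name, column names and helper functions are hypothetical, chosen to mirror the fields described above.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per named human decision on one model output.
SCHEMA = """
CREATE TABLE decision_ledger (
    id           INTEGER PRIMARY KEY,
    model        TEXT NOT NULL,
    model_input  TEXT NOT NULL,
    model_output TEXT NOT NULL,
    decision     TEXT NOT NULL
                 CHECK (decision IN ('accept', 'modify', 'reject')),
    decided_by   TEXT NOT NULL CHECK (length(decided_by) > 0),
    decided_at   TEXT NOT NULL
);
-- Any UPDATE or DELETE is aborted by the database itself.
CREATE TRIGGER ledger_no_update BEFORE UPDATE ON decision_ledger
BEGIN SELECT RAISE(ABORT, 'decision_ledger is append-only'); END;
CREATE TRIGGER ledger_no_delete BEFORE DELETE ON decision_ledger
BEGIN SELECT RAISE(ABORT, 'decision_ledger is append-only'); END;
"""


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    """Create a fresh ledger database with the append-only triggers."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_decision(conn, model, model_input, model_output,
                    decision, decided_by) -> int:
    """Append one decision row; the timestamp is set here, not by the caller."""
    decided_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO decision_ledger "
            "(model, model_input, model_output, decision, decided_by, decided_at) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (model, model_input, model_output, decision, decided_by, decided_at),
        )
    return cur.lastrowid
```

Triggers are only part of the boundary, because anyone with schema rights could drop them. A production ledger would presumably also rely on database permissions that deny UPDATE and DELETE to the application's role.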
This is the move that turns engineering into governance. In a surveying build, the AI disclosure at the foot of each report is assembled from the actual ledger of what the surveyor accepted, modified or rejected, which is precisely the disclosure expectation RICS now sets. The pattern generalises to any regulated profession, as we show in "What a RICS AI disclosure teaches every regulated profession". It is also how a board evidences the UK's five principles. For example, accountability becomes a named decision row, transparency becomes a checkable citation, and contestability becomes a recorded human review. The duties those controls answer to are listed in the UK AI Regulation Tracker.
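Generating the disclosure from the ledger can be as simple as counting rows. The sketch below is illustrative only: the function name, the row keys (model, decision, decided_by) and the wording of the output are hypothetical, and the text is not RICS-prescribed language.

```python
from collections import Counter
from typing import Iterable, Mapping


def build_disclosure(rows: Iterable[Mapping[str, str]]) -> str:
    """Assemble a disclosure from ledger rows; nothing is typed from memory."""
    rows = list(rows)
    if not rows:
        return "No AI outputs are recorded in the decision ledger for this report."
    counts = Counter(row["decision"] for row in rows)  # missing keys count as 0
    reviewers = sorted({row["decided_by"] for row in rows})
    models = sorted({row["model"] for row in rows})
    return (
        f"AI assistance: {len(rows)} output(s) from {', '.join(models)} "
        f"were reviewed by {', '.join(reviewers)}. "
        f"Accepted: {counts['accept']}. "
        f"Modified: {counts['modify']}. "
        f"Rejected: {counts['reject']}."
    )
```

Because the text is derived from the same rows an auditor would inspect, the disclosure and the record cannot disagree; changing the disclosure means appending a new decision.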
Can these controls be retrofitted?
They can, but the boundary is far cheaper to design before deployment than after an incident. Buying capability first and governing it later fails reliably, because the evidence the controls produce is exactly what an after-the-fact review cannot reconstruct.
If a system is already live, the honest sequence is to map what it can currently do, constrain that to what it should do, and instrument the decisions it influences. That is the shape of our governance engineering review. If you are earlier than that, start by testing the claim every vendor and every internal team will make to your board: that a human is in the loop and the data is safe. The Board AI Scorecard is a fast first check on whether your organisation could evidence either, and the articles below show each control in production detail.



