
AI governance for UK boards

Make your AI risk register living evidence, not a spreadsheet

An AI risk register that only updates quarterly is already stale. Structure it with NIST's Govern-Map-Measure-Manage and feed it from the systems themselves.

Dr Karl George MBE · 8 min read · Updated 29 May 2026 · Researched and drafted with AI assistance

A board we worked with kept its AI risks in a tab on the corporate risk spreadsheet. Row 14 read: "AI tooling — hallucination risk — owner: CTO — mitigation: human review — RAG: amber." It had said exactly that for three quarters. In the same three quarters, the underlying system had processed tens of thousands of messages, changed two of its models, and shipped a new approval gate. The spreadsheet knew none of it. The risk had not stood still; only the record of it had.

That is the failure mode. Not that boards lack a register, but that the register is a snapshot taken at a meeting, describing systems that move every week. By the time it is reviewed, it describes a state the organisation has already left.

Key takeaways

  • A register maintained by hand on a board's cadence is structurally incapable of tracking a system whose model, usage and controls shift between meetings; it should be fed by the systems, not transcribed.
  • Structuring the register around NIST's Govern-Map-Measure-Manage forces every risk to have a named owner, a location, a measure with a source and date, and a control that does something.
  • "Living evidence" means the control produces its own dated record as a by-product of operating (an append-only ledger, a confidence floor, a read-only constraint), so a board can query the record rather than trust a status colour.
  • Every entry must be dated and labelled by provenance, distinguishing a controlled test result from a live-production reading from a point-in-time figure.
  • A row answering all four functions is living evidence; a row that fills none is "human review in place, amber" — unchanged for quarters while the system moved on.

A register that updates quarterly governs a system that changed yesterday

The pacing problem is well rehearsed at board level: capability moves faster than the cadence at which a board can ratify it. The AI risk register is where that gap becomes concrete and auditable. A quarterly spreadsheet assumes the residual risk you assessed in January still holds in April. For a static process, that assumption is usually safe. For an AI system, whose model, usage and controls all shift between meetings, it is not — and a row maintained by hand records none of those shifts.

The point is not that spreadsheets are bad. It is that a register maintained by hand, on a board's cadence, is structurally incapable of tracking a system that emits evidence continuously. The register should be fed by the systems, not transcribed from memory at quarter-end.

Structure the register around NIST's four functions, not a list of fears

Most AI risk registers are a list of things people are worried about. That is a reasonable start and a poor structure, because it conflates "what could go wrong" with "what we are doing about it" and "how we would know." A better backbone already exists.

We structure AI risk using the NIST AI Risk Management Framework's Govern-Map-Measure-Manage model. NIST AI RMF 1.0 is voluntary US guidance, not a law and not a certification — but its four functions map cleanly onto the columns a board register actually needs, and they translate the same evidence into the language of an ISO/IEC 42001 management system when you need that too.

NIST function | What it asks | What the register column holds
Govern | Who is accountable, under what policy? | Named owner, the policy or principle engaged, the decision authority
Map | Where is the risk, in which system and context? | The specific AI use, its data, its blast radius if it fails
Measure | How do we know, and how often? | The metric, its source, its threshold, the last reading and its date
Manage | What control reduces it, and is it working? | The enforced control, who acts on exceptions, residual rating

Govern sits above the other three as the cross-cutting accountability function. The discipline this imposes is simple: every risk you list must have a named owner, a place it lives, a way it is measured, and a control that does something. A row that cannot fill the Measure column is not a managed risk; it is a hope.
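To make the four-column discipline concrete, here is a minimal sketch of a register row as a data structure, with a check that reports which functions a row leaves unfilled. The class name, field names and helper are illustrative assumptions, not a schema from NIST or from any system described here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RegisterRow:
    # Govern: who is accountable, under what policy
    owner: str = ""
    policy: str = ""
    # Map: where the risk lives and what it can reach
    system: str = ""
    blast_radius: str = ""
    # Measure: the metric, its source, and the last dated reading
    metric: str = ""
    metric_source: str = ""
    last_reading: Optional[float] = None
    reading_date: Optional[date] = None
    # Manage: the enforced control and who acts on exceptions
    control: str = ""
    exception_handler: str = ""


# Hypothetical grouping of fields by NIST function.
FUNCTION_FIELDS = {
    "Govern": ("owner", "policy"),
    "Map": ("system", "blast_radius"),
    "Measure": ("metric", "metric_source", "last_reading", "reading_date"),
    "Manage": ("control", "exception_handler"),
}


def unfilled_functions(row: RegisterRow) -> list[str]:
    """Return the NIST functions for which the row leaves any field empty."""
    missing = []
    for function, names in FUNCTION_FIELDS.items():
        if any(getattr(row, name) in ("", None) for name in names):
            missing.append(function)
    return missing
```

A row with an owner, a system and a control but no metric or dated reading would come back as `["Measure"]`, which is the "hope, not a managed risk" case in code form.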

"Living evidence" means the control produces the record automatically

The phrase "living evidence" is doing real work here, so it is worth being precise. A living register entry is one where the system that runs the risk also produces the proof, as a by-product of operating, without anyone writing it down for the board.

This is the difference between a control you assert and a control you can show. Consider three patterns from systems we have built, each of which feeds a register column directly:

  • An append-only decision ledger. In a bespoke surveying system, every AI suggestion records the model used, the input, the output, and the surveyor's accept, modify or reject decision, with their name and a timestamp. The table is append-only by construction — no update or delete operations exist. The Measure column for "AI alters professional output without sign-off" is not a sentence; it is a query against that ledger. We wrote about why an append-only ledger is the audit trail AI governance needs, and you can see the surveying system itself in our AI-assisted surveying case study.
  • A confidence floor that overrules the model. In an invoice-automation build, a configurable confidence threshold (default 0.9) can override the model's decision to post and force a manual query instead, with an explicit reason code. The Manage column for "AI posts a low-quality extraction to the finance system" points at deterministic code that runs regardless of what the model wants. In a controlled test suite dated 20 May 2026, 14 of 14 scenarios returned the expected decision and query code — a test result, labelled as such, not a live-production rate.
  • Read-only by construction. In a property-operations system, the analytics database enforces transaction_read_only = on at the Postgres level, so the database itself rejects any write the model attempts. The Manage column for "AI changes records it should only read" is backed by the engine, not by a policy document.

In each case the board does not have to trust a status colour. It can ask for the evidence, and the evidence already exists, dated, because producing it is how the control works.
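The confidence-floor pattern above can be sketched in a few lines. This is an illustration under stated assumptions, not the invoice system's actual code: the function name, the `Decision` type and the reason codes are hypothetical, and treating a reading exactly at the threshold as passing is also an assumption. Only the 0.9 default comes from the text.

```python
from dataclasses import dataclass

DEFAULT_CONFIDENCE_FLOOR = 0.9  # default threshold quoted in the article


@dataclass(frozen=True)
class Decision:
    action: str       # "post" or "query"
    reason_code: str  # explicit reason recorded alongside the decision


def apply_confidence_floor(
    model_action: str,
    confidence: float,
    floor: float = DEFAULT_CONFIDENCE_FLOOR,
) -> Decision:
    """Deterministic gate that runs after the model has given its answer."""
    if model_action == "post" and confidence < floor:
        # The model wanted to post, but the floor overrules it.
        return Decision("query", "LOW_CONFIDENCE")  # hypothetical code
    if model_action == "post":
        return Decision("post", "CONFIDENCE_OK")  # hypothetical code
    # Anything other than "post" is routed to a manual query.
    return Decision("query", "MODEL_REQUESTED_QUERY")  # hypothetical code
```

The design point is that the gate is ordinary branching code: it can be unit-tested scenario by scenario, and each returned reason code is itself a dated record once it is logged.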

What this does to the four columns in practice

Reframing the register this way changes what each column contains and who maintains it.

The owner stops being a job title parked in a cell and becomes the person who acts on the exceptions the system surfaces. The measure stops being "human review in place" and becomes a figure with a source and a date — the count of overridden decisions last month, the proportion of outputs that failed an anti-hallucination check, the number of actions a system intentionally aborted rather than complete on uncertain input. The residual rating stops being a colour set by sentiment and becomes a reading you can defend, because you can show how it was derived.
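As a rough illustration of a measure that is a query rather than a sentence, the sketch below builds a small append-only ledger in SQLite and counts the decisions a reviewer overrode in a given month. Everything here is assumed for the example: the table and column names, the use of SQLite triggers to block updates and deletes (the surveying system described earlier is said to omit those operations altogether), and the sample rows, which are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE decision_ledger (
    id         INTEGER PRIMARY KEY,
    model      TEXT NOT NULL,
    ai_input   TEXT NOT NULL,
    ai_output  TEXT NOT NULL,
    reviewer   TEXT NOT NULL,
    decision   TEXT NOT NULL CHECK (decision IN ('accept', 'modify', 'reject')),
    decided_at TEXT NOT NULL  -- ISO 8601 timestamp
);
-- Stand-in for "append-only by construction": refuse updates and deletes.
CREATE TRIGGER ledger_no_update BEFORE UPDATE ON decision_ledger
BEGIN SELECT RAISE(ABORT, 'decision_ledger is append-only'); END;
CREATE TRIGGER ledger_no_delete BEFORE DELETE ON decision_ledger
BEGIN SELECT RAISE(ABORT, 'decision_ledger is append-only'); END;
""")

# Invented sample rows, for illustration only.
sample_rows = [
    ("model-a", "site notes 1", "draft clause 1", "Reviewer One", "accept", "2026-04-03T09:15:00"),
    ("model-a", "site notes 2", "draft clause 2", "Reviewer One", "modify", "2026-04-10T14:02:00"),
    ("model-b", "site notes 3", "draft clause 3", "Reviewer Two", "reject", "2026-04-21T11:40:00"),
    ("model-b", "site notes 4", "draft clause 4", "Reviewer Two", "modify", "2026-05-02T08:30:00"),
]
conn.executemany(
    "INSERT INTO decision_ledger "
    "(model, ai_input, ai_output, reviewer, decision, decided_at) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    sample_rows,
)


def overridden_count(connection: sqlite3.Connection, month: str) -> int:
    """Count modify/reject decisions in a month given as 'YYYY-MM'."""
    (count,) = connection.execute(
        "SELECT COUNT(*) FROM decision_ledger "
        "WHERE decision IN ('modify', 'reject') "
        "AND substr(decided_at, 1, 7) = ?",
        (month,),
    ).fetchone()
    return count
```

Calling `overridden_count(conn, "2026-04")` gives the month's figure straight from the ledger, and any attempt to rewrite history fails at the database rather than relying on convention.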

This is also where the register earns its keep as audit evidence. When a regulator, an auditor or your own audit committee asks "how do you govern this AI?", the honest answer is rarely a policy. It is the artefacts: the ledger, the test runs, the exception logs, the dated thresholds. A register structured as living evidence is the index to those artefacts. It tells the questioner where to look and what they will find.

Date everything, and know which evidence is live and which is a test

A living register has a discipline a spreadsheet rarely enforces: every entry is dated, and every figure is labelled by its provenance. This matters because not all evidence is equal.

A controlled test result ("14 of 14 scenarios passed, 20 May 2026") tells you the control behaves as designed on known inputs. A production reading tells you what is actually happening at real volume. A point-in-time figure ("99.23% outbound send success as of 20 April 2026") is true on its date, and a successful send may still not mean final delivery to the recipient. A board that conflates these three will over-trust its own register. Labelling them keeps the register honest, and keeps you from quoting a test as if it were field performance.
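One way to keep the three kinds of evidence apart is to make provenance a required field on every reading, so a figure cannot be quoted without its label and date. The sketch below is a minimal illustration; the type names and the `cite` and `is_field_performance` helpers are assumptions made for the example. The two figures are the ones quoted above.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Provenance(Enum):
    CONTROLLED_TEST = "controlled test"
    LIVE_PRODUCTION = "live production"
    POINT_IN_TIME = "point in time"


@dataclass(frozen=True)
class Reading:
    label: str
    value: str
    as_of: date
    provenance: Provenance

    def cite(self) -> str:
        """Render the figure with its provenance and date attached."""
        return (
            f"{self.label}: {self.value} "
            f"({self.provenance.value}, {self.as_of.isoformat()})"
        )


def is_field_performance(reading: Reading) -> bool:
    """Only a live-production reading may be quoted as field performance."""
    return reading.provenance is Provenance.LIVE_PRODUCTION


suite = Reading(
    "Confidence-floor test suite",
    "14 of 14 scenarios passed",
    date(2026, 5, 20),
    Provenance.CONTROLLED_TEST,
)
sends = Reading(
    "Outbound send success",
    "99.23%",
    date(2026, 4, 20),
    Provenance.POINT_IN_TIME,
)
```

Because `provenance` and `as_of` have no defaults, a reading without a label or date cannot be constructed at all, which is the discipline the register needs.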

The regulatory backdrop makes dating non-negotiable. The UK has no single AI statute; governance is principles-based and regulator-led, with the DSIT five principles — including accountability and contestability — applied through existing regulators. Where AI touches personal data, the Information Commissioner's Office is the lead regulator under UK GDPR. The automated decision-making regime moved under the Data (Use and Access) Act 2025: section 80 came into force on 5 February 2026, repealing the old Article 22 and introducing Articles 22A-22D. The ICO's updated guidance on automated decision-making and profiling remains in draft — its consultation closed on 29 May 2026, with final guidance expected summer 2026 — so treat any reliance on it as provisional and date the assumption in your register.

The same applies to time-sensitive external obligations. If your AI is caught extraterritorially by the EU AI Act, the high-risk application dates were pushed back under the 2026 Digital Omnibus to 2 December 2027 (stand-alone) and 2 August 2028 (embedded), agreed politically in May 2026 but still completing the EU legislative process as of late May 2026. That is precisely the kind of claim a register should hold with a "verify-by" date attached, not a settled fact.
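A "verify-by" date is easy to hold alongside the claim itself. The following sketch is illustrative only: the class and function names are assumptions, and the verify-by dates are hypothetical placeholders that an organisation would set for itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ExternalAssumption:
    claim: str
    recorded_on: date
    verify_by: date  # date by which someone must re-check the claim


def due_for_verification(assumptions, today):
    """Return the assumptions whose verify-by date has arrived or passed."""
    return [a for a in assumptions if today >= a.verify_by]


register = [
    ExternalAssumption(
        claim="ICO guidance on automated decision-making is still in draft",
        recorded_on=date(2026, 5, 29),
        verify_by=date(2026, 9, 30),  # hypothetical re-check date
    ),
    ExternalAssumption(
        claim="EU AI Act high-risk dates deferred, legislative process incomplete",
        recorded_on=date(2026, 5, 29),
        verify_by=date(2026, 7, 31),  # hypothetical re-check date
    ),
]
```

Run on a schedule, `due_for_verification(register, date.today())` surfaces the claims that can no longer be relied on without a fresh check.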

A board's reasonable test for any AI risk row

You do not need to read the code to govern the register. You need to ask four questions of any row, mapped to the four functions:

  1. Govern — Who is named, and what gives them the authority to act when this risk fires?
  2. Map — Which exact system is this, what data does it touch, and what happens downstream if it is wrong?
  3. Measure — What is the figure, where does it come from, when was it last read, and is it a test or live?
  4. Manage — What control reduces this, and can you show me it working without anyone preparing a slide?

A row that answers all four is living evidence. A row that answers none is the line we started with: "human review in place, amber, owner: CTO," unchanged for three quarters while the system underneath it moved on.

Last reviewed: 29 May 2026.


If your AI risk register is a spreadsheet tab that updates at board cadence, it is worth a short conversation about wiring it to the systems themselves. See our approach to evidence and controls on the trust page, or get in touch.
