Responsible AI in practice

Human-in-the-loop AI design for boards

A board-level guide to placing named human judgement, evidence and escalation rights around AI-assisted decisions.

Hamada Mahdi · 18 June 2026 · 8 min read · Researched and drafted with AI assistance, reviewed by Karl George MBE

Human-in-the-loop AI design means a named person has authority, information and time to accept, change or reject an AI-assisted outcome before it affects a person, record or regulated decision. It is a control design problem, not a reassuring label.

This applies where AI supports decisions about people, money, professional work, regulated reporting or production records. It does not mean every low-risk drafting task needs a committee. It does mean the board should know which outcomes the system can produce automatically, which require review, and what evidence proves a person actually decided.

Key takeaways

Human oversight only counts when the person can see the relevant input, understand the AI output, change the outcome and leave a record.
The board's decision is where to require human authority: before a write, before external communication, before a legal or similarly significant effect, or at a defined confidence or value threshold.
A review button is weak evidence; a decision ledger with accept, modify and reject states is stronger evidence.
UK data protection law remains relevant wherever AI uses personal data, and the ICO warns that mere human involvement does not necessarily amount to meaningful human review.
ISO/IEC 42001, NIST AI RMF and the UK's regulator-led AI principles all point towards named accountability, documented controls and traceable risk treatment.

Where human-in-the-loop AI design belongs

Start by naming the actual decision. The ICO's Article 22 fairness guidance says organisations should identify which step in the process produces or overwhelmingly determines the direct legal or similarly significant effect, because that is where human agency has to be assessed: the ICO on Article 22 and fairness. A person typing data into an AI system before the system decides is not the same as a person weighing the actual outcome.

That distinction matters for board design. Human review should sit at the point where the AI output would otherwise become an organisational act: a letter sent, a payment posted, a case prioritised, a customer classified, a report signed or a record changed. We apply the same principle in engineered controls such as the append-only AI decision ledger, confidence floors and reason codes, and read-only database access. The model may advise; a human or deterministic control decides.

In practice, there are four good places to put the human gate:

Before an external consequence: where an AI-assisted output would be sent to a customer, resident, employee, regulator or professional client.
Before a write to a system of record: where a generated decision would change an ERP, CRM, case-management or property-management record.
Before a legal or similarly significant effect: where the outcome affects access to a service, employment, finance, housing, education or another material interest.
At a threshold: where confidence, value, risk class, missing evidence or an exception code requires a person to decide.

The gate has to be designed into the workflow. A person who is shown a model answer after the decision is already made is an approver of the process, not a reviewer of the outcome.
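The four gate positions above can be read as a routing rule that runs before any AI-assisted output takes effect. The following is a minimal Python sketch under stated assumptions, not a reference implementation: `ProposedOutcome`, `gate_reasons`, the reason-code strings and both threshold values are hypothetical names and numbers chosen for illustration. A real board would approve its own.

```python
from dataclasses import dataclass, field


@dataclass
class ProposedOutcome:
    """An AI-assisted output that has not yet taken effect (hypothetical shape)."""
    external_recipient: bool        # would be sent outside the organisation
    writes_system_of_record: bool   # would change an ERP, CRM or case record
    significant_effect: bool        # legal or similarly significant effect
    confidence: float               # model confidence between 0.0 and 1.0
    value_gbp: float                # monetary value at stake
    missing_evidence: bool = False
    exception_codes: list[str] = field(default_factory=list)


# Illustrative thresholds only (assumptions); the board sets the real values.
CONFIDENCE_FLOOR = 0.85
VALUE_CEILING_GBP = 5_000.0


def gate_reasons(outcome: ProposedOutcome) -> list[str]:
    """Return every reason a named human must decide.

    An empty list means no gate applies and the outcome may proceed
    automatically under the board-approved rules.
    """
    reasons = []
    if outcome.external_recipient:
        reasons.append("external_consequence")
    if outcome.writes_system_of_record:
        reasons.append("system_of_record_write")
    if outcome.significant_effect:
        reasons.append("significant_effect")
    if outcome.confidence < CONFIDENCE_FLOOR:
        reasons.append("below_confidence_floor")
    if outcome.value_gbp > VALUE_CEILING_GBP:
        reasons.append("above_value_threshold")
    if outcome.missing_evidence:
        reasons.append("missing_evidence")
    if outcome.exception_codes:
        reasons.append("exception_code")
    return reasons


def requires_human(outcome: ProposedOutcome) -> bool:
    """True when at least one gate applies."""
    return bool(gate_reasons(outcome))
```

Returning the list of reasons rather than a bare yes or no is a deliberate choice in this sketch: the reasons can be stored alongside the eventual human decision, so the record shows why the case reached a person.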

What the board needs to decide

The board does not need to approve every prompt or model version. It does need to approve the decision rights. Those rights should be written plainly enough that a product owner, compliance lead and internal auditor can all test the same rule.

The board should decide:

Which outcomes may be automated: for example, low-risk summarisation, internal triage or draft preparation.
Which outcomes require a named human: for example, external letters, regulated reports, customer-affecting classifications, spending above a threshold or exceptions with missing evidence.
What authority the reviewer has: accept, modify, reject, escalate or override, with no default path that treats silence as approval.
What information the reviewer must see: source material, AI output, confidence or reason code, known limits, previous decisions and affected-person context.
What record is kept: reviewer name, role, decision, timestamp, input, output, change made and reason where relevant.
What is reported back to the board: exceptions, overrides, near misses, contested decisions, threshold changes and unresolved control failures.

This is where many designs fail. They put a human somewhere in the workflow, then ask that person to carry responsibility without the authority or evidence needed to use judgement. The ICO's guidance on explaining AI-assisted decisions says organisations may need to explain how and why a human judgement assisted by AI was reached, and to make accountability traceable: ICO and Alan Turing Institute guidance on explanations. A board should therefore ask for the review record, not just the workflow diagram.
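One way to make the decision rights and record fields above testable is to encode them so that a silent or incomplete review cannot be stored at all. The sketch below is illustrative only: `Decision`, `ReviewRecord`, the field names and the rule that every non-accept decision needs a reason are assumptions made for this example, not a prescribed schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    MODIFY = "modify"
    REJECT = "reject"
    ESCALATE = "escalate"
    OVERRIDE = "override"


# Assumption for this sketch: any decision other than accept needs a reason.
NEEDS_REASON = frozenset(
    {Decision.MODIFY, Decision.REJECT, Decision.ESCALATE, Decision.OVERRIDE}
)


@dataclass(frozen=True)  # frozen: a stored record cannot be edited in place
class ReviewRecord:
    reviewer_name: str
    reviewer_role: str
    decision: Decision       # no default value, so silence cannot mean approval
    ai_input: str
    ai_output: str
    final_output: str = ""   # the changed text when the decision is MODIFY
    reason: str = ""
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self) -> None:
        if not self.reviewer_name.strip() or not self.reviewer_role.strip():
            raise ValueError("a named reviewer and role are required")
        if not isinstance(self.decision, Decision):
            raise ValueError("an explicit decision is required")
        if self.decision is Decision.MODIFY and self.final_output.strip() in (
            "",
            self.ai_output.strip(),
        ):
            raise ValueError("a modified decision must record the changed output")
        if self.decision in NEEDS_REASON and not self.reason.strip():
            raise ValueError("this decision type must record a reason")
```

Because `decision` has no default, code that forgets to supply it fails immediately instead of quietly recording an approval, which is the property an internal auditor would want to test.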

Controls and evidence to require

Human oversight becomes governable when it produces evidence. The board should ask for a control table before launch and compare it with the live AI risk register after launch.

Control	Evidence to keep	Owner
Decision inventory	List of AI-assisted decisions, risk class, affected parties and system of record touched	Product owner with risk lead
Review authority	Written accept, modify, reject, escalate and override rights for each decision type	Accountable executive
Reviewer competence	Role requirements, training records and access to source material	Function lead
Decision ledger	Immutable record of AI input, AI output, human decision, name, timestamp and reason	System owner
Exception routing	Confidence floors, missing-evidence rules, value thresholds and reason codes	Product owner with compliance
Contestability route	Named route for challenge, response time, reviewer authority and outcome record	Compliance or customer operations
Board reporting	Monthly or quarterly control pack with overrides, rejected outputs, drift signals and unresolved incidents	Risk committee secretary

The table should be tested against real cases. Pick one accepted decision, one modified decision, one rejected decision and one contested decision. If the team cannot show the source material, the model output, the person who decided and the reason the path was taken, the control is not ready.
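The four-case test above can be run as a simple check over a sample of exported ledger entries. This is a sketch under assumptions: it supposes each entry is a dictionary with the hypothetical keys `path`, `source_material`, `ai_output`, `decided_by` and `reason`, which will differ in any real system.

```python
REQUIRED_PATHS = ("accepted", "modified", "rejected", "contested")
REQUIRED_EVIDENCE = ("source_material", "ai_output", "decided_by", "reason")


def control_gaps(sample: list[dict]) -> list[str]:
    """List every gap found in the sample; an empty list means the test passed."""
    # One case per path is enough for this test. If the sample holds several
    # entries for the same path, the last one is the one checked.
    by_path = {entry.get("path"): entry for entry in sample}

    gaps = []
    for path in REQUIRED_PATHS:
        entry = by_path.get(path)
        if entry is None:
            gaps.append(f"no {path} case in sample")
            continue
        for key in REQUIRED_EVIDENCE:
            # Treat absent, None and blank values as missing evidence.
            if not str(entry.get(key) or "").strip():
                gaps.append(f"{path} case is missing {key}")
    return gaps
```

A non-empty result is the signal described above that the control is not ready, and each string names the specific case and evidence item to chase.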

One design rule follows: do not make the human reviewer fight the interface. Review screens should show the evidence in the order a decision-maker needs it: source first, AI output second, reason code or confidence third, available actions last. The system should make the decision options explicit. It should never hide "reject" behind an extra menu while making "approve" the largest button.

Framework mapping for UK boards

Frameworks do not give one universal answer to human oversight. They do give a defensible mapping from board intent to operational evidence.

Framework or regime	What it asks of the organisation	What good human oversight supplies
ISO/IEC 42001	ISO describes ISO/IEC 42001 as a management-system standard for establishing, implementing, maintaining and continually improving an AI management system, with traceability, transparency and reliability listed as benefits: ISO/IEC 42001.	Defined decision rights, documented controls, review records and periodic improvement actions.
NIST AI RMF	NIST says AI RMF 1.0 is for managing AI risks and organises activity through Govern, Map, Measure and Manage functions: NIST AI RMF 1.0. The framework also says human judgement should be used when setting trustworthiness metrics and thresholds.	Board-approved thresholds, measured reviewer performance, tracked overrides and documented risk treatment.
UK GDPR and ICO	The ICO says mere human involvement in the AI lifecycle does not necessarily qualify as meaningful human review, and that human review should relate to the actual outcome where Article 22 is engaged: ICO Article 22 guidance.	A reviewer with authority to change the outcome, access to the relevant evidence and a recorded decision.
UK regulator principles	The government's 2024 response lists five cross-sector AI principles for regulators: safety, transparency, fairness, accountability and contestability: GOV.UK AI regulation response.	Human gates aligned to risk appetite, clear explanations, named accountability and a challenge route.
FCA-regulated financial services	The FCA says its AI approach is principles-based, focused on outcomes and reliant on existing frameworks rather than extra AI-specific rules: FCA approach to AI. It highlights Consumer Duty and accountability and governance as relevant frameworks.	Decision rights mapped to senior-manager accountability, customer outcome evidence and escalation rules.

The practical point is not to write "human-in-the-loop" into a policy and stop. It is to show how the human gate satisfies the relevant management-system, risk-management, data-protection and regulator expectations for the actual use case.

Common mistakes and the next step

The first mistake is treating human oversight as a staffing pattern. A team can have many reviewers and still have weak control if those reviewers cannot change the outcome, cannot see the source, or are measured only on speed.

The second mistake is using the same review model for every risk. A marketing draft, an invoice posting, a housing allocation recommendation and a regulated professional report do not need the same gate. They need gates matched to effect, reversibility, data sensitivity and evidence quality.

The third mistake is trusting "approve" without recording "modify" and "reject". A board cannot learn from human oversight if the only stored outcome is approval. Modified and rejected outputs are the useful evidence: they show where the model is weak, where thresholds are wrong, and where staff need better context.
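As a small illustration of why all three states need to be stored, the share of each outcome can be computed directly from the recorded decisions. This is a hedged sketch: `outcome_rates` and the state strings are hypothetical, and it assumes the ledger can export one decision state per reviewed output.

```python
from collections import Counter

STATES = ("accept", "modify", "reject")


def outcome_rates(decisions: list[str]) -> dict[str, float]:
    """Share of each recorded decision state, for a board control pack."""
    unknown = set(decisions) - set(STATES)
    if unknown:
        raise ValueError(f"unrecognised decision states: {sorted(unknown)}")
    if not decisions:
        return {}
    counts = Counter(decisions)
    # Report every state, including those with a zero share, so an
    # approval-only ledger is visible as modify 0.0 and reject 0.0.
    return {state: counts[state] / len(decisions) for state in STATES}
```

A ledger that only ever stores approvals would report `accept` at 1.0 and the other two at 0.0, which tells the board nothing about where the model is weak.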

The fourth mistake is burying contestability outside the product. The GOV.UK principles include contestability and redress, and ICO explanation guidance points to human review routes for affected individuals. If a user cannot challenge an AI-assisted decision, or if the challenge cannot reach someone with authority, the human gate is incomplete.

For a live system, the next step is a controls review. Use the Board AI Scorecard to test whether your board can name the AI decisions already in use. If the system is already near customers, records or regulated work, the AI governance diagnostic is the better route because it reviews decision rights, evidence and framework alignment together. You can also see how we express these controls on the Trust page, review shipped examples in case studies, or use the services page to choose a design, assurance or remediation path.

Last reviewed: 18 June 2026.

Sources: ISO/IEC 42001 · NIST AI RMF 1.0 · ICO Article 22 and fairness guidance · ICO explaining decisions made with AI · GOV.UK AI regulation response · FCA approach to AI

