Every few weeks a new product crosses my desk promising "AI visibility": discovery of shadow AI usage, model inventories, risk scores, traffic-light summaries for the board. Some of these tools are decent. The telemetry is real. And yet I keep meeting organisations that own two or three of them and still cannot get a sensitive workload through a customer security review.
The reason is simple enough once you see it. A dashboard tells you what is happening. Assurance is an account of why what is happening is acceptable — and that account has to be constructed by people, because it is made of judgements, not measurements.
The question dashboards can't answer
Picture the moments when AI assurance is actually tested. A bank's third-party risk team sends a 200-question due-diligence pack before renewing a contract. An internal auditor asks who approved connecting the copilot to the case-management system. A regulator's supervisory letter asks how the firm satisfies itself that model-enabled decisions are subject to adequate oversight. A board member, after reading something alarming in the weekend papers, asks the CISO directly: are we exposed to this?
Not one of those questions is answered by a risk score. Each of them is asking for a position: here is what we run, here is what it can and cannot do, here is who decided that and on what basis, here is the evidence, and here is the risk we have knowingly accepted. A dashboard can supply exhibits for that position. It cannot supply the position.
I have watched a team respond to a customer questionnaire by exporting screenshots from their posture tool. The customer's reviewer — quite rightly — sent it back with a one-line reply: this tells us what your tooling sees, not what your organisation has decided. The renewal slipped a quarter.
Why the confusion persists
Tools are easier to buy than judgement, and a purchase feels like progress. There is a budget line for software; there is rarely a budget line for "construct a defensible architecture position", even though the latter is what an audit finding actually requires. So organisations accumulate visibility and remain unable to give an account of themselves: measured, but inarticulate.
There is also a quieter structural cause. The account I am describing has to hold across four audiences at once: engineering needs it to be technically true, security needs it to map to controls, risk needs it to express appetite, and leadership needs it short enough to defend in a meeting. In most organisations no single role owns the document that satisfies all four. The dashboard ends up standing in for it because at least the dashboard exists.
What the work actually looks like
The organisations that do this well maintain something quite old-fashioned: a small, current set of decision records and architecture positions for their material AI workloads. What the workload is for. What it can access. The hosting and supplier arrangement. The controls relied upon, with pointers to where the evidence lives — and yes, some of those pointers go to dashboards; this is where the tooling earns its keep. The risks accepted, with names and dates against them.
When the questionnaire arrives, they answer it in days, consistently, in language that survives scrutiny — because the thinking was done before the question was asked. When the workload changes materially, the position is revised, which is the part that takes discipline rather than software.
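For readers who think in data structures, here is one way such a record could be held. It is a minimal sketch in Python, not a prescribed format: the class names, the field names, the `gaps` check and the 180-day review threshold are all assumptions made for illustration, and the sample workload is hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Control:
    name: str
    evidence_pointer: str  # where the evidence lives, e.g. a dashboard view


@dataclass
class AcceptedRisk:
    description: str
    accepted_by: str   # a named person, not a team alias
    accepted_on: date


@dataclass
class ArchitecturePosition:
    workload: str
    purpose: str                 # what the workload is for
    access: list[str]            # what it can access
    hosting_and_supplier: str    # the hosting and supplier arrangement
    controls: list[Control] = field(default_factory=list)
    accepted_risks: list[AcceptedRisk] = field(default_factory=list)
    last_reviewed: date = field(default_factory=date.today)

    def gaps(self, today: date, max_age_days: int = 180) -> list[str]:
        """List reasons this record is not yet fit to sign.

        The 180-day default is an arbitrary assumption, not a standard.
        """
        problems: list[str] = []
        if not self.controls:
            problems.append("no controls recorded")
        for control in self.controls:
            if not control.evidence_pointer.strip():
                problems.append(f"control '{control.name}' has no evidence pointer")
        for risk in self.accepted_risks:
            if not risk.accepted_by.strip():
                problems.append(f"accepted risk '{risk.description}' has no named owner")
        if today - self.last_reviewed > timedelta(days=max_age_days):
            problems.append(f"position not reviewed since {self.last_reviewed.isoformat()}")
        return problems


# Hypothetical example: a copilot connected to a case-management system.
position = ArchitecturePosition(
    workload="support-copilot",
    purpose="Drafts replies for case handlers; a human sends them.",
    access=["case-management system (read-only)"],
    hosting_and_supplier="Hypothetical supplier, single-tenant hosting",
    controls=[Control("Prompt and response logging", "")],
    accepted_risks=[
        AcceptedRisk("Drafts may cite stale policy text", "", date(2024, 3, 1))
    ],
    last_reviewed=date(2024, 3, 1),
)

issues = position.gaps(today=date(2024, 12, 1))
for issue in issues:
    print(issue)
```

A check like this says nothing about whether the judgement in the record is sound. It only flags the mechanical omissions (a control with no evidence, a risk with no name against it, a review that has lapsed) that stop a position from being signed at all.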
None of this argues against buying tools. It argues against mistaking the instrument panel for the airworthiness certificate. Telemetry is cheap now. A position someone is prepared to sign is as expensive as it ever was, because it is made of judgement — and judgement is the one thing the vendors are not selling.