Series • AI Governance Infrastructure • Part 1 of 3

When Prompt Quality Becomes an AI Governance Problem

Output quality varying because prompt quality varies is a characteristic of generative AI. Output quality varying because prompt quality is unmanaged is a governance failure.

February 2026 · Dr. Gbemisola Adetayo

One of the most overlooked issues in enterprise AI governance is not the model. It is what happens before the model does anything at all.

The statement that output quality varies with prompt quality is technically accurate. But it is also incomplete in a way that matters. Stated as a standalone observation, it describes how generative AI works. It does not describe where governance has failed. Understanding the difference between those two things is, in my view, one of the more important distinctions for organizations that are serious about deploying AI responsibly at scale.

The Distinction That Changes Everything

There is a useful way to separate three different layers of questions that organizations need to be asking about their AI programs. The technology layer asks whether the model works. The operational layer asks whether people are using it correctly. The governance layer asks something different: are outputs reliable, repeatable, and appropriately controlled?

That third question is where most organizations are significantly underinvested. And it is the question that prompt variability directly implicates — not because prompt quality affects output, which it always will, but because organizations have largely left prompt quality unmanaged.

The distinction is important. Variability in output quality is a characteristic of generative AI systems. These models are probabilistic, and they are sensitive to how they are instructed. Given clear and well-structured instructions, a model can produce accurate, traceable, decision-ready outputs. Given vague or unconsidered instructions, the same model, with the same user working on the same task, can produce outputs that are misleading, incomplete, or factually wrong. That is not a defect. That is how the technology works.

The governance problem emerges not from that variability itself, but from the organizational conditions that allow it to compound unchecked.

What Unmanaged Variability Actually Costs

Consider a straightforward scenario. Fifty analysts across an organization are using a generative AI tool to summarize internal policy documents. Each analyst has developed their own approach to prompting, because no standard exists. One analyst asks the model to summarize the document, highlight obligations, deadlines, and responsible parties, and cite section numbers. Another asks the model to explain what the document means. Both analysts are working on the same class of task. Both are using the same model. The outputs they produce are not equivalent: the first is far more likely to be traceable and accurate, while the second is likely to omit requirements and include hallucinated interpretations. From a process standpoint, however, the organization treats them as equivalent because it has no mechanism to distinguish them.
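To make the contrast concrete, here is a minimal sketch of what an approved template for this task might look like if the first analyst's approach were adopted as the standard. The names `TEMPLATE_ID`, `TEMPLATE_VERSION`, `POLICY_SUMMARY_TEMPLATE`, and `build_policy_summary_prompt` are hypothetical, chosen for illustration only. The template wording simply restates the first analyst's request and is not a recommended or tested prompt.

```python
from string import Template

# Hypothetical identifiers: a versioned template lets an organization say
# which wording produced a given output.
TEMPLATE_ID = "policy-summary"
TEMPLATE_VERSION = "1.0"

# Illustrative wording only; it mirrors the first analyst's request.
POLICY_SUMMARY_TEMPLATE = Template(
    "Summarize the policy document below.\n"
    "Highlight every obligation, deadline, and responsible party.\n"
    "Cite the section number for each item.\n"
    "\n"
    "Document title: $title\n"
    "Document text:\n"
    "$body\n"
)


def build_policy_summary_prompt(title: str, body: str) -> str:
    """Fill the shared template so every analyst sends the same instructions."""
    if not title.strip() or not body.strip():
        raise ValueError("title and body are required")
    return POLICY_SUMMARY_TEMPLATE.substitute(title=title, body=body)
```

The point of the sketch is not the wording but where it lives. Once the instructions sit in one shared, versioned place, two analysts doing the same task no longer produce different prompts by default.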

The governance concern here is not that the second analyst prompted poorly. The governance concern is that the organization has no controls in place to prevent that from happening, no way to detect it when it does, and no documented basis for understanding why one output is reliable and another is not.

Governance exists to create consistency, reliability, accountability, and managed risk. Uncontrolled prompting undermines all four simultaneously. When output quality depends primarily on individual prompting skill rather than organizational process, outcomes become a function of talent distribution rather than institutional capability. That is a governance failure — not a training deficit, not a technology limitation, a governance failure.

What Absence of Controls Looks Like in Practice

Organizations that have not addressed prompt governance tend to exhibit a recognizable set of conditions. There are no prompt standards, so every individual develops their own approach and output quality varies accordingly. There are no approved templates for high-stakes or frequently recurring tasks, which means that identical work produces different answers depending on who completes it. There are no validation requirements, so users extend trust to AI outputs without a systematic mechanism for verifying accuracy, and hallucinations enter business processes without detection. Prompting expertise concentrates in a small number of individuals, creating key-person risk that organizations would immediately recognize and address if it appeared in any other critical operational domain. And because there is no documentation of what produced reliable outputs, there is no institutional memory — no reproducibility, no auditability, no basis for continuous improvement.
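Two of the missing controls named above, validation and documentation, can be sketched in a few lines. This is an illustration under stated assumptions, not a complete validation method. It assumes policy documents label their sections as "Section 2.1", and every function name and record field is hypothetical. The check confirms only that a summary cites sections that exist in the source, which is a crude proxy and does not establish that the summary is accurate.

```python
import hashlib
import json
import re

# Assumption: sections appear in the text as "Section 3" or "Section 3.2".
SECTION_REF = re.compile(r"[Ss]ection\s+(\d+(?:\.\d+)*)")


def cited_sections(text: str) -> set:
    """Return the section numbers mentioned in a piece of text."""
    return set(SECTION_REF.findall(text))


def validate_summary(source: str, summary: str) -> dict:
    """Flag summaries that cite nothing, or cite sections absent from the source."""
    in_source = cited_sections(source)
    in_summary = cited_sections(summary)
    unknown = sorted(in_summary - in_source)
    return {
        "has_citations": bool(in_summary),
        "unknown_sections": unknown,
        "passed": bool(in_summary) and not unknown,
    }


def audit_record(template_id: str, template_version: str,
                 prompt: str, summary: str, validation: dict) -> str:
    """Serialize what produced an output, so the result can be traced later."""
    return json.dumps(
        {
            "template_id": template_id,
            "template_version": template_version,
            "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
            "output_sha256": hashlib.sha256(summary.encode("utf-8")).hexdigest(),
            "validation": validation,
        },
        sort_keys=True,
    )
```

For example, if the source contains only Section 1 and Section 2.1, a summary that cites Section 7 fails the check with `unknown_sections` equal to `['7']`, and a summary with no citations fails because `has_citations` is false. Storing the record alongside each output is one way to begin building the institutional memory described above.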

The compounding effect of these absences is that operational risk grows in step with AI adoption. Every new user, every new use case, every new workflow that runs on unmanaged prompting extends the exposure. And because the outputs often look reasonable on the surface, the risk remains invisible until something consequential goes wrong.

Adoption Problem or Governance Problem?

A reasonable counterargument is that what I have described is primarily an adoption problem: people do not know how to prompt effectively, and the solution is better training. That view is partly correct, and I want to be precise about where it is right and where it is insufficient.

It is correct that users who understand prompting produce better outputs. Capability development matters and organizations should invest in it. But training individuals to prompt better does not create organizational controls. It relocates the dependency from one individual's ignorance to another individual's knowledge. The underlying structural problem — that output quality is a function of individual skill rather than institutional process — remains unchanged.

Mature AI governance programs recognize that governance and adoption are not sequential. Governance is not something that arrives after adoption, once the program is established enough to warrant oversight. It shapes adoption from the beginning by creating the policies, standards, templates, training, monitoring, and continuous improvement mechanisms that convert individual capability into repeatable organizational practice. The distinction matters because it changes what responsible AI transformation requires at the design stage, not the audit stage.

The Analogy That Holds

Aviation is a useful reference point here, not because AI and aviation are equivalent in their risk profiles, but because aviation has worked through precisely this problem at scale. Pilot skill varies. That variation is expected and unavoidable, and it is not itself a governance failure. What aviation recognized early is that when safety depends entirely on individual talent — when there are no checklists, no standard operating procedures, no validated protocols for recurring tasks — the system is fragile in a way that training alone cannot fix. The controls exist not to replace skill but to make skill replicable, auditable, and less dependent on any one person performing at their best on any given day.

AI programs are not different in this respect. Output quality varying because prompt quality varies is a natural characteristic of the technology. Output quality varying because prompt quality is unmanaged is an organizational design choice, and it is one that governance has a clear responsibility to address.

A More Useful Framing

The statement worth making — from a governance perspective — is not that output quality varies with prompt quality. The statement worth making is this:

The absence of standardized prompting practices and output validation creates inconsistent output quality and increases operational, legal, and compliance risk at a rate proportional to the scale of AI adoption.

That framing matters because it locates the problem correctly. It is not a technology problem. It is not purely a training problem. It is a governance problem — specifically, the absence of the controls that governance exists to provide.

When assessing AI governance maturity, the most productive diagnostic question is not whether users know how to prompt. It is whether the organization has built the institutional infrastructure that makes good prompting a process rather than a talent. Those are different questions, and they lead to different interventions.

As AI adoption scales, the organizations that build durable capability will not be the ones with the best individual prompters. They will be the ones that have converted good prompting practice into repeatable organizational infrastructure — embedded in standards, validated through controls, and designed to produce consistent outputs regardless of who is doing the work. That is what prompt governance looks like in practice. And it is one of the more consequential gaps in how most enterprise AI programs are currently designed.

Continue the series

Next: Your AI Outputs Are Only as Good as Your Validation Layer


Dr. Gbemisola Adetayo · Founder & Principal, Arrell Advisory · This article is the first in a series on the governance infrastructure that enterprise AI programs are currently missing.