AI Strategy & Transformation

The Agentic Execution Gap:
Why 90% of AI Agent Initiatives Never Reach Operational Scale

Most organizations believe AI agent success is determined by the quality of the technology. It is not. The gap between a working agent and a working business is not intelligence. It is execution.

By Erik R. Miller · 33 min read
[Figure: The Agentic Value Decay Curve™, Version 1.0. Intended value stays flat at 100% while realized value falls through six stages: Expected Value 100%, Agent Capability 80%, Workflow Adoption 60%, Human Trust 45%, Measurement Visibility 30%, Business Impact 15%. Losses between stages are labeled capability loss, weak adoption, trust deficit, blind measurement, and no business pull.]
The Agentic Value Decay Curve™ — Erik R. Miller, ERM Advisory. Value erodes stage by stage from 100% expected value to roughly 15% business impact; the widening gold area is the Agentic Execution Gap.

I have spent my career in the space between the thing that gets approved and the result that actually arrives. Artificial intelligence has not closed that space. It has widened it. In its 2025 study of enterprise AI, MIT's Project NANDA found that roughly 95% of enterprise generative-AI pilots delivered no measurable return on the profit-and-loss statement.[1] McKinsey's State of AI arrives from the other direction and lands in the same place: only about 6% of organizations report significant enterprise-wide impact from AI.[2]

Read those two numbers together and the headline of this article stops being a provocation and becomes a conservative estimate. If only around one in twenty pilots reaches measurable return, and only about one in sixteen organizations reaches enterprise-wide impact, then saying that roughly 90% of AI agent initiatives never reach operational scale is, if anything, generous to the field. I use 90% deliberately as a rounded, defensible synthesis of those cited findings — not as a dramatized statistic.

Here is what makes the pattern strange. The technology mostly works. The demos are real. The models are extraordinary and improving monthly. And still the value does not arrive. That contradiction — capable agents, absent outcomes — is the subject of this article. The organizations that are stuck did not buy worse technology. They never solved the operational problem of turning that technology into how the business runs. I call the distance between those two states the Agentic Execution Gap.

If you lead an organization, you have probably watched a version of this story unfold. The board approves an AI initiative with real conviction. A pilot is stood up, and it works — the demo is genuinely impressive, the early users are enthusiastic, and leadership celebrates a milestone. Then the months pass. Usage that spiked at launch quietly flattens. The workflow it was meant to transform looks much as it did before. When the budget review arrives and someone asks what the initiative returned, the room goes quiet — not because nothing happened, but because no one can prove what did. The investment is questioned. The conclusion, almost always, is that the technology was not ready.

It was. What was missing was everything the technology could not supply on its own: the integration, the trust, the measurement, and the change in how the business actually works. That missing layer — not the model — is the subject of this article.

Definition — The Agentic Execution Gap™

The Agentic Execution Gap is the gap between successfully deploying AI agents and successfully integrating them into business operations at scale. It is not produced by weak technology. It is produced by small, compounding losses across five layers of execution — strategy, workflow, oversight, measurement, and adoption. Most organizations do not have an AI capability problem. They have an Agentic Execution Gap problem.

“The gap is not intelligence. The gap is execution. The agent was never the hard part — the operating model around it was.”
Erik R. Miller
Executive Summary

AI agents are advancing faster than the organizations meant to use them. The constraint on value has moved from the model to the operating model. This article defines the Agentic Execution Gap — the distance between a deployed agent and an adopted one — and gives leaders a way to see it, measure it, and close it. It introduces four original frameworks: the Agentic Value Decay Curve, which shows how expected value erodes from 100% to roughly 15% between deployment and impact; the five-layer ERM Agentic Execution Framework; the ERM Agentic Maturity Model; and the 90-Day Agentic Scale Roadmap. It closes with a 15-question executive self-assessment. The argument throughout is simple: agent capability is necessary and nowhere near sufficient, and the organizations that win the agentic era will be the ones that treat execution, not intelligence, as the scarce resource.

The Argument in Brief

Key Takeaways

  • AI agent failure is overwhelmingly an operational failure, not a technical one — capability is rarely the binding constraint.
  • Expected value decays across five layers; because the losses multiply rather than add, a capable agent can still produce almost no business impact.
  • Adoption is not execution. People can use agents widely while the business changes nothing — and unchanged operations produce no durable value.
  • ROI is unprovable without a baseline captured before deployment; measurement is the layer most often skipped and most often fatal.
  • Governance is not a brake on agents — it is the precondition for trusting them with real work.
  • Scaling is sequential: close one workflow end to end, prove it, then expand. Closing the gap is an operating discipline, not a procurement decision.
The Agentic Execution Gap — By the Numbers
| Finding | Figure | Source |
| --- | --- | --- |
| Enterprise generative-AI pilots delivering no measurable P&L return | ~95% | MIT Project NANDA, 2025 |
| Organizations reporting significant enterprise-wide EBIT impact from AI | ~6% | McKinsey, State of AI 2025 |
| Organizations that have redesigned any workflow around AI | ~21% | McKinsey, State of AI 2025 |
| Agentic AI projects expected to be canceled by end of 2027 | 40%+ | Gartner, 2025 |
| Companies adopting AI agents, yet most say half or fewer employees use them daily | 79% / 68% | PwC AI Agent Survey, 2025 |
| Fastest-rising barrier to scaling generative AI in the enterprise | Regulation & risk | Deloitte, 2024 |

By the Numbers — the case for treating AI as an execution problem, not a capability problem. All figures attributed to primary research; see references for full citations.

Why AI Agent Deployments Fail

The comfortable explanation for AI failure is that the technology is not ready. It is comfortable because it implies patience is the cure: wait for the next model, the next context window, the next benchmark, and the value will arrive. But that is not what the evidence shows, and it is not what I see inside organizations. The agents that stall are not noticeably less capable than the agents that succeed. What differs is everything that was supposed to happen around the agent — the integration, the trust, the measurement, the change in how work is done.

A deployed agent is a snapshot. An adopted agent is a film. Deployment is a moment — the agent works in a demo, clears a procurement review, lands in a sandbox. Adoption is the thousands of subsequent decisions, handoffs, and habit changes that either carry the agent into the daily life of the business or quietly leave it stranded beside the real work. Leadership teams pour energy into the snapshot and almost none into the film. So the gap opens, not dramatically, but in the ordinary friction of getting an organization to actually rely on something new.

Answer — Why do AI agent deployments fail?

AI agent deployments fail when capability never becomes adoption. Agents are built or bought, demonstrated successfully, then stall because they are not embedded in real workflows, the people around them do not trust them, leadership cannot measure their output, and the business never changes how it works. Failure happens in operations, not in the model.

Notice that none of those failure points is technical. This is the central misdiagnosis of the agentic era: organizations treat an operational problem as a capability problem, and so they respond to a stalled initiative by shopping for a better model instead of building a better operating model. MIT's researchers found the same thing in the data: generic, capable tools stalled in enterprise use precisely because they did not learn from or adapt to the organization's actual workflows.[1] The intelligence was present. The integration was not.

Operator's Note

“Nobody fails to scale AI because the model was not smart enough. They fail because the organization never changed to let the model matter.”

Erik R. Miller — ERM Advisory

The Agentic Value Decay Curve™

To see why capable agents produce so little, follow a single unit of expected value as it travels from deployment to business impact. It does not lose its worth in one place. It loses a little at each stage it passes through — and, as with all execution, the losses do not add. They multiply. The Agentic Value Decay Curve is the picture of that journey, and it is the most important idea in this article.

Begin with 100% of the value a leadership team expected when it approved the initiative. The agent itself is excellent but imperfect in context, so perhaps 80% of that value survives as real agent capability on the organization's actual tasks. Then the agent has to live inside a workflow; where integration is shallow, workflow adoption carries maybe 60% forward. The people in that workflow have to trust it enough to depend on it, and human trust — the hardest layer — might pass 45%. Leadership has to be able to see what the agent produces, and where measurement is thin, visibility drops the realized value to around 30%. Finally the business has to actually change how it operates, and absent that, business impact lands near 15%.

The Agentic Value Decay Curve — how 100% of expected value becomes 15% of business impact
| Stage | Value Surviving | Where the Value Leaks |
| --- | --- | --- |
| Expected Value | 100% | The business case at approval: the full promise of the initiative. |
| Agent Capability | 80% | The agent is excellent in the demo but imperfect on the organization's real, messy tasks. |
| Workflow Adoption | 60% | The agent sits beside the workflow rather than inside it; people route around it. |
| Human Trust | 45% | The people who must rely on it do not yet trust it enough to stop double-checking. |
| Measurement Visibility | 30% | Leadership cannot see what the agent produced, so its value is invisible at review time. |
| Business Impact | 15% | The organization never changed how it works, so almost none of the promise reaches the P&L. |

Illustrative model. Realized business impact ~15% · Agentic Execution Gap ~85% · Values are directional, not measured constants.

The exact percentages are illustrative, not measured constants; do not treat 15% as a law of nature. The mechanism, however, is real and unforgiving: when value must survive five sequential layers, the survival rates multiply rather than add. Five layers at 80% effectiveness yield roughly 33% realized value (0.8 raised to the fifth power). Five layers at 60% yield about 8%. This is why an organization can be competent at every individual step and still watch the overwhelming majority of its expected value disappear, and why the disappearance is so hard to see. No single layer failed. Everything was merely good enough, and good enough compounds downward.
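The compounding arithmetic can be checked in a few lines. What follows is a minimal sketch rather than part of the framework itself: the function name `realized_value` is hypothetical, and the stage-to-stage retention rates are derived here from the article's illustrative curve (80/100, 60/80, 45/60, 30/45, 15/30), which is directional and not measured.

```python
from math import prod

def realized_value(retention_rates):
    """Fraction of expected value left after every layer takes its cut."""
    return prod(retention_rates)

# Uniform cases quoted in the text.
five_at_80 = realized_value([0.80] * 5)   # roughly a third survives
five_at_60 = realized_value([0.60] * 5)   # under a tenth survives

# Illustrative curve: value surviving at each stage, as a fraction of expected value.
curve = [1.00, 0.80, 0.60, 0.45, 0.30, 0.15]
# Retention between consecutive stages (assumption: derived from the curve above).
stage_retention = [after / before for before, after in zip(curve, curve[1:])]
curve_end = realized_value(stage_retention)

print(f"{five_at_80:.1%} {five_at_60:.1%} {curve_end:.1%}")
```

Note that the illustrative curve does not assume equal losses at every stage: the implied retention falls from 80% at the first transition to 50% at the last, yet the end-to-end result is still simply the product of the five rates.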

Answer — What prevents AI agents from scaling?

AI agents fail to scale when each layer of execution leaks a little value: unclear business mandate, shallow workflow integration, low human trust, invisible measurement, and no change in how the business operates. Because these losses multiply rather than add, a capable agent can still deliver almost no business impact. Scaling requires closing every layer, not improving the model.

The same mathematics that punishes you also rewards you. In a multiplicative system, gains compound upward exactly as losses compound down. Lift every layer modestly — integrate a little deeper, earn a little more trust, measure a little better — and realized impact climbs far faster than any single improvement would suggest. That asymmetry is the entire strategic logic of closing the gap: broad, disciplined improvement across all five layers beats a heroic investment in any one. And to improve the layers, you first have to name them. That is the framework.
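To make the asymmetry concrete, here is a small sketch comparing two hypothetical improvement strategies from the same starting point. The 60% starting rate, the ten-point lift, and the function name are assumptions chosen for illustration; none of them comes from the research cited in this article.

```python
from math import prod

def realized_value(retention_rates):
    """Multiply per-layer retention rates into end-to-end realized value."""
    return prod(retention_rates)

baseline = [0.60] * 5             # every layer merely adequate
heroic = [1.00] + [0.60] * 4      # perfect one layer, leave the other four alone
broad = [0.70] * 5                # lift every layer by ten points

print(f"baseline: {realized_value(baseline):.1%}")
print(f"heroic:   {realized_value(heroic):.1%}")
print(f"broad:    {realized_value(broad):.1%}")
```

Under these assumed numbers, a ten-point lift across all five layers ends higher than a single perfected layer. The margin depends on the starting rates, so treat the output as an illustration of the mechanism and not as a forecast.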

The ERM Agentic Execution Framework™

The framework breaks agentic execution into five layers. Each is a place where agent capability can be carried forward or lost. They are sequential in logic but simultaneous in practice: a strong organization holds all five at once, and a gap in any single layer is enough to break the chain. An initiative does not need all five layers to fail in order to underperform. One is enough.

The Framework: Five Layers of Agentic Execution
| Layer | Name | Question for Leadership |
| --- | --- | --- |
| 01 | Strategy Alignment | Is the agent solving a problem the business values? |
| 02 | Workflow Integration | Does the agent live inside the real workflow? |
| 03 | Human Oversight | Do people trust it enough to rely on it? |
| 04 | Measurement | Can leadership see what it produces? |
| 05 | Business Adoption | Has the organization changed how it works? |

How to read it: agent capability enters at Layer 1 and must survive all five layers to become realized business outcomes. The Agentic Execution Gap is the cumulative loss across the layers — which is why buying a better agent rarely closes it, and why diagnosing the leaking layer is the first job of leadership.

[Figure: The ERM Agentic Execution Framework™, Version 1.0. Agent capability enters at the top and must pass through five layers in sequence to become realized business outcomes. Each layer has a named leak: 01 Strategy Alignment (mandate drift), 02 Workflow Integration (pilot purgatory), 03 Human Oversight (trust deficit), 04 Measurement (invisible ROI), 05 Business Adoption (reversion).]
The ERM Agentic Execution Framework™ — five layers agent capability must survive to become business outcomes. Each layer names the failure mode where value leaks.
“Agentic execution is not one thing you do well. It is five things that must all hold at once.”
Erik R. Miller

Layer 1 — Strategy Alignment

The first layer asks whether the agent is solving a problem the business actually values. Most agent initiatives begin with the technology (“what could an agent do here?”) rather than the outcome (“what does the business most need done?”). The result is a capable agent pointed at a problem nobody was losing sleep over. MIT found that more than half of generative-AI budgets went to sales and marketing tools while the largest measurable returns sat in unglamorous back-office automation.[1] That is a strategy-alignment failure, not a technology failure.

01 Strategy Alignment
Definition

The degree to which an agent is aimed at a problem the business genuinely values and has prioritized — tied to a real outcome, not a demonstration.

Purpose

Alignment is what makes the rest of the work worth doing. An agent pointed at a valued outcome earns the patience and investment needed to survive the other four layers.

Common Failure Modes
  • Technology-first selection — building what is possible, not what matters
  • Chasing visible use cases over valuable ones
  • No named business owner who wants the outcome
Warning Signs
  • The agent is described by what it does, not what it changes
  • Nobody can state the dollar or hour value at stake
  • It is a science project in search of a sponsor
Executive Example

A services firm built an impressive agent to draft client proposals — a visible, demo-friendly task. It worked. It also saved little, because proposals were not the bottleneck; contract turnaround was. The capability was real and the alignment was wrong, so the value never showed up. Where execution loss occurs here: the agent burns its credibility on a problem the business did not need solved.

Leadership Questions
  • What outcome, in business terms, does this agent move?
  • Who is the executive that wants it?
  • Would we miss it if it disappeared tomorrow?
Executive Actions
  • Start from the prioritized outcome, then choose the agent
  • Attach every agent to a named business owner
  • Quantify the value at stake before building
Executive Takeaway — Layer 1

The most expensive agent is a capable one solving a problem nobody values. Alignment is cheap to get right at the start and ruinous to discover at the end. Choose the outcome first.

Layer 2 — Workflow Integration

The second layer asks whether the agent lives inside the real workflow or merely beside it. This is where most pilots die — in the gap between “the agent can do this” and “the agent does this, here, as part of how the work actually flows.” An agent that requires people to leave their tools, copy context in, and paste results back is not integrated; it is an errand. Errands get skipped under pressure, and pressure is constant.

02 Workflow Integration
Definition

The degree to which the agent is embedded in the systems, data, and steps of the real workflow, so using it is the path of least resistance rather than an extra task.

Purpose

Integration converts capability into routine. When the agent is where the work already happens, adoption stops depending on willpower and starts depending on design.

Common Failure Modes
  • The agent lives in a separate tool nobody opens
  • It lacks the context and permissions to finish a task
  • Hand-offs to and from people are undefined
Warning Signs
  • Usage spikes at launch, then decays to zero
  • People describe it as “extra work”
  • The agent produces drafts no system can act on
Executive Example

A support organization deployed a resolution agent that was genuinely good, but it lived in a standalone console, not the ticketing system the support reps worked in all day. Reps had to switch tools, re-enter the case, and transcribe the answer back. Within a month, usage had collapsed. McKinsey's data names the pattern precisely: only about 21% of organizations had redesigned any workflow around AI, while nearly 80% layered it on top of existing processes.[2] Where execution loss occurs here: the agent is real, but the workflow never made room for it.

Leadership Questions
  • Does the agent live where the work already happens?
  • Can it complete a task, not just suggest one?
  • Are the human hand-offs explicitly designed?
Executive Actions
  • Redesign the workflow around the agent, do not bolt it on
  • Give it the access and context to finish work
  • Make using it easier than not using it
The Pattern

“An agent beside the workflow is a demo. An agent inside the workflow is a capability. The distance between them is where most pilots quietly die.”

Erik R. Miller — ERM Advisory

Layer 3 — Human Oversight

The third layer asks whether the people around the agent trust it enough to rely on it — and whether the organization has built the oversight that makes such trust rational. Trust is not a feeling to be managed with change communications; it is earned through visible reliability and a credible safety net. People will not depend on an agent they have to second-guess, and an agent that is second-guessed on every output saves no one any time. This is also where governance lives, because trust without governance is recklessness, and governance without trust is theater.

03 Human Oversight
Definition

The system of human supervision, escalation, and accountability that lets people rely on an agent — knowing what it decides alone, what it escalates, and who is responsible when it errs.

Purpose

Oversight is what converts a capable agent into a trusted one. Well-designed human-in-the-loop control is not friction; it is the precondition for delegation.

Common Failure Modes
  • No defined boundary between agent and human authority
  • Either blind trust or blanket distrust — never calibrated
  • No clear owner accountable for the agent's actions
Warning Signs
  • Every output is manually re-checked, erasing the savings
  • Or nobody checks anything and risk accumulates silently
  • “Who approved that?” has no answer
Executive Example

A finance team gave an agent real authority to categorize and route transactions but built no escalation path for the ambiguous cases. After one visible error, the team quietly reverted to manual review for everything, keeping the agent running while trusting none of it. Gartner warns that inadequate risk controls are among the top reasons it expects over 40% of agentic projects to be canceled by the end of 2027.[3] Deloitte’s State of Generative AI in the Enterprise reports the same shift from the executive seat: regulation and risk have become the single largest barrier to scaling.[4] Where execution loss occurs here: capability is intact, but trust collapsed and took the value with it.

Leadership Questions
  • What is the agent allowed to decide alone?
  • When and how does it escalate to a human?
  • Who is accountable when it gets something wrong?
Executive Actions
  • Define decision rights and escalation explicitly
  • Calibrate oversight to risk, not to anxiety
  • Name a single accountable owner for the agent
Answer — What is AI agent governance?

AI agent governance is the framework of policies, permissions, oversight, and accountability that determines what an AI agent may do, who is responsible for its actions, and how its behavior is monitored and corrected. Good governance does not slow agents down; it is what lets an organization trust them enough to give them real work.
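Decision rights and escalation become easier to discuss once they are written down as explicit rules. The sketch below is one hypothetical way to express them for a transaction-routing agent like the one in the finance example; the class name, field names, and thresholds are all assumptions made for illustration, not a recommended policy or any vendor's API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DecisionPolicy:
    """Hypothetical decision rights for one agent in one workflow."""
    owner: str             # the single accountable person
    max_amount: float      # largest amount the agent may handle alone
    min_confidence: float  # below this, the agent must escalate

def route(policy: DecisionPolicy, amount: float, confidence: float) -> str:
    """Return 'agent' if the agent may decide alone, otherwise 'escalate'."""
    if amount > policy.max_amount or confidence < policy.min_confidence:
        return "escalate"
    return "agent"

# Illustrative thresholds only; real values would be set by the accountable owner.
policy = DecisionPolicy(owner="controller", max_amount=5_000.0, min_confidence=0.90)

print(route(policy, amount=1_200.0, confidence=0.97))    # routine case
print(route(policy, amount=1_200.0, confidence=0.62))    # ambiguous case
print(route(policy, amount=25_000.0, confidence=0.99))   # high-stakes case
```

The point of the sketch is that oversight is calibrated to risk: routine cases pass through without a second check, while ambiguous or high-stakes cases go to a person, and the policy names who is accountable.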

Layer 4 — Measurement

The fourth layer asks whether leadership can see what the agent actually produces. This is the layer organizations skip most often and regret most painfully, because it is invisible until budget season — and then it is fatal. An agent whose value cannot be measured cannot be defended, and an initiative that cannot be defended is canceled regardless of how well it worked. The single most important measurement act happens before deployment: capturing the baseline. Without a record of the pre-agent state, no after-the-fact number can prove the agent did anything.

04 Measurement
Definition

The instrumentation and baselining that make an agent's contribution visible to leadership — in the same business terms used to justify it.

Purpose

Measurement converts impact into evidence. It is what turns a believed success into a fundable one and protects the initiative when budgets are scrutinized.

Common Failure Modes
  • No baseline captured before the agent went live
  • Tracking activity (usage) instead of outcomes (value)
  • Metrics the agent's sponsor cannot connect to the P&L
Warning Signs
  • “It's clearly helping” with no number behind it
  • Dashboards show prompts sent, not value created
  • The ROI question produces silence in the room
Executive Example

A marketing team's content agent almost certainly saved meaningful time, but no one had recorded how long the work took beforehand. When the budget review arrived, the team could show usage but not savings — and the initiative was cut despite working. Where execution loss occurs here: real value existed and evaporated at the exact moment it needed to be proven, because the baseline was never taken.

Leadership Questions
  • What was the baseline before the agent?
  • Are we measuring outcomes or just activity?
  • Can the sponsor state the ROI in one sentence?
Executive Actions
  • Capture the baseline before deployment, always
  • Measure outcomes in business terms, not prompts
  • Report agent value in every operating review
Answer — How do organizations measure AI agent ROI?

Organizations measure AI agent ROI by setting a pre-agent baseline for a specific workflow — time, cost, quality, or revenue — then measuring the same metric after the agent is embedded, net of the cost to build, integrate, govern, and maintain it. Without a baseline captured before deployment, agent ROI is unprovable and budgets get cut.
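The baseline-then-measure logic reduces to simple arithmetic. The following is a minimal sketch under stated assumptions: the function name, parameter names, and dollar figures are hypothetical, and it assumes the baseline and post-agent costs cover the same workflow over periods of equal length.

```python
def agent_roi(baseline_cost: float, post_agent_cost: float, agent_cost: float) -> float:
    """Net return per unit of agent cost over one measurement period.

    baseline_cost:   workflow cost measured before deployment
    post_agent_cost: the same metric after the agent is embedded
    agent_cost:      cost to build, integrate, govern, and maintain the agent
    """
    if agent_cost <= 0:
        raise ValueError("agent_cost must be positive")
    gross_savings = baseline_cost - post_agent_cost
    return (gross_savings - agent_cost) / agent_cost

# Hypothetical figures for a single workflow over one quarter.
roi = agent_roi(baseline_cost=120_000.0, post_agent_cost=70_000.0, agent_cost=40_000.0)
print(f"ROI: {roi:.0%}")
```

Without a recorded `baseline_cost`, the function has nothing to subtract from, which is the practical meaning of "unprovable": usage data alone cannot fill that argument.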

Layer 5 — Business Adoption

The fifth and final layer asks whether the organization has actually changed how it works because of the agent. This is the difference between usage and adoption, and it is the layer where the largest share of expected value is won or lost. People can use an agent constantly while the business operates exactly as it did before — same headcount plans, same process steps, same cycle times — in which case the agent is an expensive convenience, not a source of value. Adoption means the work itself is different: roles shift, steps disappear, capacity is redeployed. Nothing changes the P&L until the operating model changes.

05 Business Adoption
Definition

The degree to which the organization has restructured its work — roles, processes, and capacity — to depend on the agent, rather than merely permitting people to use it.

Purpose

Adoption is where value is realized. It is the layer that converts time saved into capacity redeployed, and capability into a changed P&L.

Common Failure Modes
  • Usage rises but no process or role actually changes
  • Time saved is reabsorbed, never redeployed
  • The organization reverts under the first pressure
Warning Signs
  • The org chart and process map look identical to last year
  • “We use AI” but cannot name what changed
  • The agent is additive, never substitutive
Executive Example

Across many organizations, agents quietly save hours that are simply absorbed back into longer meetings and more polishing: real time saved, zero value realized, because no one decided what the freed capacity was for. McKinsey's finding that workflow redesign correlates most strongly with EBIT impact is the same point in data: value comes from changing the work, not from adding a tool to it.[2] PwC’s 2025 AI Agent Survey captured the paradox in numbers: 79% of companies report adopting agents, yet 68% say half or fewer of their employees actually interact with them in daily work. Access is broad; adoption is shallow.[5] Where execution loss occurs here: every prior layer succeeded, and the value still vanished because the business never changed.

Leadership Questions
  • What did we stop doing because of the agent?
  • Where did the freed capacity go?
  • Would removing the agent now actually hurt?
Executive Actions
  • Decide in advance what freed capacity is for
  • Redesign roles and processes, not just tools
  • Make the agent load-bearing, not optional
The Law

“Adoption is not how many people use the agent. Adoption is how much the business would break if you took it away.”

Erik R. Miller — ERM Advisory

Real-World Example: How AI Agent Initiatives Lose Momentum

The five layers are easier to recognize in motion than in the abstract. The pattern below is a composite — drawn from documented market research rather than any single company — but every executive who has run an AI initiative will recognize its shape. It is deliberately built from published findings, not invented metrics, because the value of the example is in the pattern, not in numbers that cannot be verified.

Initial enthusiasm. A capable organization decides to put an AI agent to work. The appetite is real: PwC’s 2025 survey found that 88% of executives planned to increase AI budgets on the strength of agentic AI, and most expressed confidence in their strategy.[5] The initiative has visible executive sponsorship and a sense of inevitability. This is Layer 1 at its most promising and also its most fragile, because enthusiasm is not the same as alignment to a prioritized outcome.

Pilot success. The pilot works. The agent does in a controlled setting exactly what the demo promised, and the early users are genuinely impressed. The organization concludes that the hard part is behind it. In reality, the pilot has only proven capability — the first and most forgiving layer. Nothing about a successful pilot guarantees that the agent will survive contact with the real workflow.

Workflow resistance. This is where momentum begins to break down. The agent that shone in the sandbox now has to live inside the systems, hand-offs, and habits of daily work, and it does not fit cleanly. MIT’s Project NANDA found that generic, capable tools stalled in enterprise use precisely because they did not learn from or adapt to real workflows,[1] and McKinsey found that only about 21% of organizations had redesigned any workflow around AI.[2] Usage that spiked at launch begins to decay. This is the Layer 2 failure, and it is the most common place initiatives quietly die.

Governance concerns. As the agent touches anything consequential, the risk questions arrive, and they are legitimate. Who approved that action? What is the agent allowed to decide alone? Deloitte’s research shows regulation and risk rising to become the single largest barrier to scaling generative AI.[4] Without a designed oversight model, the organization defaults to one of two failure modes: blanket distrust that re-checks everything and erases the savings, or blind trust that lets risk accumulate silently. Either way, Layer 3 leaks.

Measurement challenges. Now the initiative needs to prove itself, and it cannot, because no baseline was captured before the agent went live. The team can show activity (prompts sent, seats provisioned) but not outcomes (hours saved, cost removed). Deloitte names value measurement among the crucial factors separating organizations that scale from those that stall.[4] This is the Layer 4 failure, and it is the one that turns a working initiative into a canceled one.

Business adoption failure. Even where the agent is used, the organization never changed how it works. PwC’s survey captured this directly: broad adoption rarely means deep impact, with most companies reporting that half or fewer of their people interact with agents in daily work, and the gains stopping short of transformation.[5] Time saved is reabsorbed. No role changes, no process disappears, no capacity is redeployed. Layer 5 never closes, and the P&L never moves. The end state is the one MIT and McKinsey both measured from opposite directions: a capable agent and almost no business return.[1, 2]

Lessons Learned

Notice what did not go wrong: the technology. At no point in this pattern was the model the constraint. The initiative lost momentum at the seams — workflow, trust, measurement, and adoption — exactly where the Agentic Execution Gap lives. The lesson is not to pilot more carefully or buy a better agent. It is to treat the four layers beyond capability as the real work, and to design for them before the pilot succeeds, not after it stalls.

How the Agentic Execution Gap Appears Across the Enterprise

The same gap wears different clothes in different functions. In each case below, the agent works — the technology does what it was built to do — and the business value still fails to arrive, because one or more execution layers never closed. Executives will recognize their own organizations in at least one of these.

Marketing

A content agent reliably drafts campaign copy, briefs, and variations — a genuine Layer 1 and 2 success. But if no one redesigned the content workflow around it, the time saved is quietly reabsorbed into more rounds of review, and output volume rises without any lift in pipeline or efficiency that the CMO can defend at budget time. The agent works; the marketing operating model did not change. This is the Marketing Execution Gap expressed in agents — the subject of The Marketing Execution Gap — and the integrated answer is what the AI Marketing Operating System is built to provide.

Sales

An agent drafts personalized outreach and call summaries flawlessly. Reps like it. Yet if it lives beside the CRM rather than inside it, and if no one changed what reps are measured and coached on, the agent becomes a private productivity aid that never shows up in conversion, cycle time, or win rate. High individual usage, no change in the commercial outcome — a Layer 2 and Layer 5 failure hiding behind enthusiastic adoption.

Revenue Operations

A RevOps agent can clean data, route leads, and reconcile reports across systems — precisely the connective work where value compounds. But without governance over what it may change unsupervised, and without a baseline proving what it improved, leadership cannot trust it with the consequential decisions or defend its budget. The agent is capable; the oversight and measurement layers are missing. This is the operational seam where the broader Revenue Execution Gap and its agentic cousin are the same problem viewed at different altitudes.

Customer Service

A resolution agent answers a large share of inquiries correctly. The capability is real. But if escalation paths are undefined, service staff and customers lose trust after the first visible error and route around the agent; and if the organization never redesigns staffing and handling around the agent’s capacity, the cost base does not move. The deflection rate looks good in a dashboard while the economics stay flat: Layer 3 and Layer 5 quietly undoing a working Layer 1.

The Through-Line

“Across every function, the story is the same: the agent worked, and the business did not change. Capability is not the variable. Execution is.”

Erik R. Miller — ERM Advisory

The ERM Agentic Maturity Model™

Patterns this consistent are structural, not accidental — which raises the next question an executive should ask: where does our organization sit overall? The five layers tell you where value leaks in a single initiative. The maturity model tells you where your organization stands across all of them — and, more usefully, where it is stuck. Most organizations cluster between Stage 1 and Stage 2: lots of experiments, some genuine individual assistance, and almost nothing that has reached the operational stage where agents are a dependable, governed part of how the business runs. The Agentic Execution Gap is, in maturity terms, the distance from where most organizations sit to Stage 4.

Figure: The ERM Agentic Maturity Model™ (Erik R. Miller, ERM Advisory; Version 1.0). A five-stage maturity model drawn as an ascending staircase: Stage 1 Experimenting, Stage 2 Assisted, Stage 3 Embedded, Stage 4 Operational, and Stage 5 Autonomous. The span between Stage 1 and Stage 2 is marked as the Agentic Execution Gap, where most organizations stall; the gap is the distance from there to Stage 4.
The ERM Agentic Maturity Model™ — five stages from isolated experiment to governed autonomy. Most organizations stall between Stage 1 and Stage 2.
The ERM Agentic Maturity Model — characteristics, risks, KPIs, and executive implications
| Stage | Characteristics | Primary Risk | Signature KPI | Executive Implication |
| --- | --- | --- | --- | --- |
| 1 · Experimenting | Isolated pilots and proofs of concept, driven by curiosity and hype. | Endless piloting; nothing reaches production. | Number of pilots vs. number in production | Set a bar for what graduates from pilot to workflow. |
| 2 · Assisted | Agents help individuals, but the workflow and org are unchanged. | Usage without value; time saved is reabsorbed. | Active reliance, not seats provisioned | Do not mistake adoption of a tool for change in the business. |
| 3 · Embedded | Agents run inside core workflows with human oversight. | Trust gaps and undefined escalation paths. | Share of workflow steps the agent completes | Invest in oversight and measurement before scaling further. |
| 4 · Operational | Agents are a dependable, measured, governed part of how the business runs. | Governance debt as scope expands faster than control. | Agent-attributed outcome (time, cost, revenue) | This is the target. Most value lives here, not at Stage 5. |
| 5 · Autonomous | Agents act within defined boundaries with exception-based human control. | Over-delegation; brittle autonomy outside guardrails. | Exception rate and intervention quality | Pursue selectively, only where governance is mature. |

Maturity is not a race to Stage 5. For most workflows, Stage 4 — operational, measured, governed — is the goal.

Two cautions about this model. First, maturity is workflow-specific, not organization-wide: a company can be Operational in customer support and Experimenting in finance, and averaging the two into a single “maturity score” hides exactly the information leaders need. Second, Stage 5 is not the prize. For the great majority of business processes, Stage 4 — dependable, measured, governed operation with humans in the loop — is where the value is, and the rush toward full autonomy is often a way to skip the unglamorous work of the earlier stages.
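
The first caution is easy to see with numbers. The sketch below is a minimal Python illustration, assuming a hypothetical two-workflow portfolio that mirrors the example above (customer support at Stage 4, finance at Stage 1). The names `workflow_stage` and `below_target` are illustrative assumptions, not part of the model itself.

```python
from statistics import mean

STAGES = {
    1: "Experimenting",
    2: "Assisted",
    3: "Embedded",
    4: "Operational",
    5: "Autonomous",
}

# Hypothetical portfolio mirroring the example in the text.
workflow_stage = {"customer support": 4, "finance": 1}

def below_target(stages_by_workflow, target=4):
    """Return the workflows that have not yet reached the target stage."""
    return sorted(w for w, s in stages_by_workflow.items() if s < target)

# The single averaged score that the text warns against.
average = mean(list(workflow_stage.values()))
print(f"Averaged maturity score: {average}")

# The per-workflow view that leaders actually need.
for workflow, stage in workflow_stage.items():
    print(f"{workflow}: Stage {stage} ({STAGES[stage]})")
print("Below Stage 4:", below_target(workflow_stage))
```

The averaged score of 2.5 falls between Assisted and Embedded, which describes neither workflow. The per-workflow view shows one workflow already at the target and one that has not left the pilot stage.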

Answer — What is an AI agent operating model?

An AI agent operating model is the structure of roles, workflows, decision rights, oversight, and measurement that determines how AI agents and people work together to produce outcomes. It answers who owns the agent, what it is allowed to decide, when a human intervenes, and how its value is measured. It is the difference between a demo and a durable capability.

The 90-Day Agentic Scale Roadmap™

Diagnosis is useless without a path. The roadmap below is deliberately narrow: it moves one workflow from experiment to operational scale in 90 days, closing each layer of the execution gap in sequence. The discipline is in the narrowness. Organizations fail by trying to scale ten agents shallowly at once; they succeed by taking one agent all the way through all five layers, proving it, and only then expanding. Resist the urge to broaden until the first workflow is genuinely operational.

Days 1–30

Foundation

  • Name the business outcome in dollars or hours
  • Pick one high-value, high-frequency workflow
  • Set guardrails, permissions, and access
  • Capture a hard pre-agent baseline
  • Assign one accountable business owner
Days 31–60

Workflow Integration

  • Embed the agent inside the real workflow
  • Design the human-in-the-loop oversight
  • Instrument every agent action for visibility
  • Build trust through visible reliability
  • Tune performance against the baseline
Days 61–90

Governance & Scale

  • Prove ROI against the captured baseline
  • Formalize governance and decision rights
  • Document the agent operating model
  • Expand to the next adjacent workflow
  • Set the target maturity stage
Figure: The 90-Day Agentic Scale Roadmap™ (Erik R. Miller, ERM Advisory; Version 1.0). A timeline from Day 1 to Day 90 in three phases: Days 1–30 Foundation, Days 31–60 Workflow Integration, and Days 61–90 Governance & Scale. The roadmap moves one workflow from experiment to operational scale by closing each layer of the Agentic Execution Gap in sequence.
The 90-Day Agentic Scale Roadmap™ — from isolated experiment to operational scale, one workflow at a time.
How to Use the Roadmap

The order is not optional. Skipping the baseline in Days 1–30 makes the ROI proof in Days 61–90 impossible; skipping oversight design in Days 31–60 means trust never forms. Each phase exists to make the next one survivable. Run one workflow through the full 90 days before you start a second.
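
The ordering rule can be expressed as a simple prerequisite check. This is a minimal Python sketch, not a prescribed tool: it encodes only the two dependencies stated above (ROI proof needs the baseline, trust needs designed oversight). The step labels, the `PREREQUISITES` table, and `blocked_steps` are illustrative assumptions.

```python
# Dependencies named in the text. Extend this table for your own roadmap.
PREREQUISITES = {
    "prove ROI against the baseline": ["capture a pre-agent baseline"],
    "build trust through visible reliability": ["design human-in-the-loop oversight"],
}

def blocked_steps(completed):
    """Return each step whose prerequisites are missing, with what is missing."""
    done = set(completed)
    blocked = {}
    for step, needs in PREREQUISITES.items():
        missing = [n for n in needs if n not in done]
        if missing:
            blocked[step] = missing
    return blocked

# Hypothetical team that designed oversight but skipped the baseline.
completed = [
    "set guardrails, permissions, and access",
    "design human-in-the-loop oversight",
]
print(blocked_steps(completed))
```

For this hypothetical team, the check reports that the ROI proof is blocked by the missing baseline, which is the failure the roadmap's sequence is meant to prevent.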

How the ERM Agentic Execution Framework Differs From Traditional AI Models

A fair question: how is this different from the frameworks organizations already use to manage AI? AI governance models, digital transformation frameworks, change management, AI maturity models, and enterprise AI adoption models are all valuable, and none of them is wrong. But each was built to solve a different problem, and each quietly assumes value will follow once its own piece is in place. The Agentic Execution Gap is not a competitor to these models — it is the diagnostic lens that explains why they underperform when one of the five execution layers is missing.

The ERM Agentic Execution Framework vs. traditional AI and transformation models
| Model | What It Optimizes | Its Blind Spot for Agents |
| --- | --- | --- |
| AI Governance Models | Risk, compliance, safety, and responsible-use controls | Tell you what an agent may not do; silent on whether it ever creates value or gets adopted. |
| Digital Transformation Frameworks | Large-scale technology and process modernization | Built for multi-year programs; too coarse for the workflow-level integration where agents live or die. |
| Change Management Frameworks | Moving people through a defined transition | Treat adoption as a one-time event; agentic value depends on continuous, instrumented operation. |
| AI Maturity Models | Benchmarking how advanced an organization's AI is | Describe altitude, not leakage; they grade the stage without locating the layer that is losing value. |
| Enterprise AI Adoption Models | Driving usage and access across the organization | Optimize for adoption-as-usage; blind to whether usage changes how the business actually operates. |
| ERM Agentic Execution Framework | The conversion of agent capability into business outcomes, across all five layers at once | By design, none: it is the connective lens that locates where the others leak. |
What Makes It Different

Traditional models each own one slice — risk, transformation, change, benchmarking, or usage. The ERM Agentic Execution Framework is the only lens that treats strategy, workflow, oversight, measurement, and adoption as a single connected system and pinpoints the specific layer where agent value is being lost. It does not replace your governance program or your adoption push. It explains why they are not yet producing outcomes — and it is built specifically for agents, where capability and value have come uncoupled in a way earlier technologies never managed.

The Language of Agentic Execution

You cannot manage what you cannot name. The agentic era has flooded organizations with vocabulary about models and capabilities and almost none about execution — which is exactly why the gap stays invisible. These seven definitions are deliberately precise and citation-ready: clear enough to change a conversation, standalone enough to quote.

Agentic Execution Gap

The gap between successfully deploying AI agents and successfully integrating them into business operations at scale. The parent concept: it names the space where agent capability is lost on its way to business outcomes, and the five execution layers describe how that loss happens.

Agentic AI

AI systems that can plan, take actions, use tools, and pursue goals across multiple steps with limited human direction — rather than only generating a single response to a single prompt. The shift is from answering to acting, which is precisely what raises the operational stakes.

AI Agent Governance

The framework of policies, permissions, oversight, and accountability that determines what an agent may do, who is responsible for its actions, and how its behavior is monitored and corrected. Not a brake on agents — the precondition for trusting them with real work.
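
To make the definition concrete, here is a minimal Python sketch of decision rights for a single agent. It is one possible shape under stated assumptions, not a reference implementation: the `DecisionRights` class, the action names, the "RevOps lead" owner, and the deny-by-default rule for unlisted actions are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class DecisionRights:
    """What one agent may do alone, what it must escalate, and who owns it."""
    owner: str
    autonomous: frozenset
    escalate: frozenset
    audit_log: list = field(default_factory=list)

    def route(self, action):
        """Decide how to handle an action and record the decision."""
        if action in self.autonomous:
            decision = "proceed"
        elif action in self.escalate:
            decision = f"escalate to {self.owner}"
        else:
            decision = "deny"  # assumption: unlisted actions are refused
        self.audit_log.append((action, decision))
        return decision

# Hypothetical policy for a RevOps agent.
policy = DecisionRights(
    owner="RevOps lead",
    autonomous=frozenset({"route_lead", "flag_duplicate_record"}),
    escalate=frozenset({"merge_duplicate_records", "change_account_owner"}),
)

print(policy.route("route_lead"))
print(policy.route("merge_duplicate_records"))
print(policy.route("delete_account"))
```

Each call answers the three governance questions in the definition: what the agent may do (the two action sets), who is responsible (the owner), and how behavior is monitored (the audit log).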

AI Agent Operating Model

The structure of roles, workflows, decision rights, oversight, and measurement that determines how agents and people work together to produce outcomes. The difference between a demo and a durable capability is whether an operating model exists at all.

AI Agent ROI

The measurable business return from an agent — time, cost, revenue, or quality — net of the cost to build, integrate, govern, and maintain it, measured against a defined pre-agent baseline. Without a baseline, agent ROI is unprovable, and unprovable initiatives get cut.
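
The definition reduces to simple arithmetic once a baseline exists. The following Python sketch assumes a single time-based metric and entirely made-up inputs; the `WorkflowMeasurement` fields, the hourly cost, and the monthly agent cost are hypothetical placeholders rather than benchmarks.

```python
from dataclasses import dataclass

@dataclass
class WorkflowMeasurement:
    """One metric for one workflow, captured before and after the agent."""
    baseline_hours_per_month: float  # captured before deployment
    current_hours_per_month: float   # same metric after the agent is embedded
    loaded_hourly_cost: float        # hypothetical fully loaded cost per hour
    agent_monthly_cost: float        # build, integrate, govern, maintain (amortized)

def net_monthly_return(m):
    """Value of hours saved against the baseline, net of the agent's cost."""
    hours_saved = m.baseline_hours_per_month - m.current_hours_per_month
    return hours_saved * m.loaded_hourly_cost - m.agent_monthly_cost

def roi(m):
    """Net return divided by what the agent costs to run."""
    if m.agent_monthly_cost <= 0:
        raise ValueError("agent_monthly_cost must be positive")
    return net_monthly_return(m) / m.agent_monthly_cost

# Illustrative numbers only.
example = WorkflowMeasurement(
    baseline_hours_per_month=400,
    current_hours_per_month=250,
    loaded_hourly_cost=60.0,
    agent_monthly_cost=6000.0,
)
print(f"Net monthly return: {net_monthly_return(example):.2f}")
print(f"ROI: {roi(example):.0%}")
```

Without `baseline_hours_per_month` the calculation cannot be run at all, which is the point of the definition. The hours saved also count as return only if that capacity is actually redeployed rather than reabsorbed.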

Agentic Workflow

A business process redesigned so an agent performs or orchestrates a meaningful share of the steps, with defined hand-offs to and from the humans who supervise it. The unit of agentic value is the workflow, not the model.

AI Agent Adoption

The degree to which the people in a workflow actually rely on an agent to do real work, as opposed to having access to it. Access is provisioning; adoption is dependence — and only dependence changes the business.
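
The difference between access and adoption shows up as soon as both are measured. This Python sketch is one possible operationalization, not a standard metric: it assumes reliance means completing at least `min_tasks` workflow tasks through the agent in a period. The threshold and the usage data are hypothetical.

```python
def access_rate(seats_provisioned, people_in_workflow):
    """Share of people in the workflow who have been given access."""
    if people_in_workflow <= 0:
        raise ValueError("people_in_workflow must be positive")
    return seats_provisioned / people_in_workflow

def reliance_rate(agent_tasks_by_person, people_in_workflow, min_tasks=10):
    """Share of people who completed at least min_tasks tasks via the agent."""
    if people_in_workflow <= 0:
        raise ValueError("people_in_workflow must be positive")
    reliant = sum(1 for n in agent_tasks_by_person.values() if n >= min_tasks)
    return reliant / people_in_workflow

# Hypothetical month of usage for a five-person team with five seats.
agent_tasks = {"rep_a": 42, "rep_b": 3, "rep_c": 0, "rep_d": 18, "rep_e": 1}

print(f"Access: {access_rate(5, 5):.0%}")
print(f"Reliance: {reliance_rate(agent_tasks, 5):.0%}")
```

In this made-up team every seat is provisioned, yet only two of five people depend on the agent for real work. A dashboard that reports only the first number would overstate adoption.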

Answer — What is the difference between AI adoption and AI execution?

AI adoption is whether people use an AI agent. AI execution is whether that use is integrated, governed, measured, and converted into a business outcome at scale. An organization can have high adoption — many people using agents — and still have an Agentic Execution Gap, because usage that never changes how the business operates produces no durable value.

Do You Have an Agentic Execution Gap?

Diagnosis precedes treatment. The following 15 statements are the executive version of the assessment — a fast, honest read on where your organization is losing agent value on the way to outcomes. Score each from 0 (strongly disagree — a real weakness) to 5 (strongly agree — a genuine strength), grouped by the five execution layers. Answer for the organization as it actually operates, not as it is described in the deck.

Executive Self-Assessment: The Agentic Execution Gap (15 Questions)

Strategy Alignment

  1. Every agent we deploy is tied to a business outcome we can state in dollars or hours.
  2. A named executive actively wants each agent's outcome, not just its existence.
  3. We chose our agents by starting from the problem, not from the technology.

Workflow Integration

  4. Our agents live inside the tools and systems where the work already happens.
  5. Using the agent is easier than not using it, so adoption does not rely on willpower.
  6. We redesigned the workflow around the agent rather than bolting it on top.

Human Oversight

  7. We have defined explicitly what each agent may decide alone and when it escalates.
  8. Our people trust the agents enough to rely on them without re-checking every output.
  9. A single owner is accountable for each agent's actions when something goes wrong.

Measurement

  10. We captured a baseline before deployment, so we can prove what the agent changed.
  11. We measure agent outcomes in business terms, not just usage and prompts.
  12. Each agent's sponsor can state its ROI in a single, defensible sentence.

Business Adoption

  13. We can name specific work the organization stopped doing because of an agent.
  14. Time the agents saved was redeployed deliberately, not quietly reabsorbed.
  15. Removing our agents now would genuinely disrupt how the business runs.

Score each statement from 0 (a real weakness) to 5 (a genuine strength). Then convert each of the five layers into a number between 0 and 1: average its three answers and divide by five. Multiply the five layer scores together. That product is your Realized Agentic Value — the share of expected agent value this model predicts actually reaches the business. Everything else is your Agentic Execution Gap. We multiply rather than add for the reason this whole article has argued: agent value decays, it does not average.

Worked Example

Layer scores of 0.8, 0.6, 0.5, 0.4, and 0.5 multiply to about 0.05 — a Realized Agentic Value of roughly 5%, and an Agentic Execution Gap of 95%. Notice how five “not unreasonable” layers produce a result that matches MIT's real-world finding almost exactly. That is not a flaw in the math; it is the whole point. It is also why your lowest layer deserves attention first: in a multiplicative system, your weakest layer sets your ceiling.
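
The arithmetic is easy to reproduce. Below is a minimal Python sketch of the scoring rule, using the layer scores from the worked example. The function names are illustrative assumptions, and so is the mapping of the five example scores to layers, which simply follows the order of the assessment.

```python
from math import prod

LAYERS = [
    "Strategy Alignment",
    "Workflow Integration",
    "Human Oversight",
    "Measurement",
    "Business Adoption",
]

def layer_score(answers):
    """Convert one layer's three 0-5 answers into a score between 0 and 1."""
    if len(answers) != 3 or any(not 0 <= a <= 5 for a in answers):
        raise ValueError("each layer needs exactly three answers from 0 to 5")
    return (sum(answers) / 3) / 5

def realized_agentic_value(layer_scores):
    """Multiply the five layer scores to get the share of value realized."""
    if len(layer_scores) != 5:
        raise ValueError("expected five layer scores")
    return prod(layer_scores)

# Worked example from the text; the layer order is an assumption.
scores = [0.8, 0.6, 0.5, 0.4, 0.5]
value = realized_agentic_value(scores)
gap = 1 - value
weakest = LAYERS[scores.index(min(scores))]

print(f"Realized Agentic Value: {value:.1%}")
print(f"Agentic Execution Gap: {gap:.1%}")
print(f"Weakest layer: {weakest}")
```

The product comes to about 4.8%, which the text rounds to 5%. Raising the weakest score from 0.4 to 0.8 doubles the product, which is why the lowest layer is the first place to look.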

  • 0–20%: Operational. Agents are creating real, defensible value.
  • 21–45%: Embedded. Working, with targeted layers to close.
  • 46–70%: Significant Agentic Execution Gap.
  • 71%+: Critical. Capability without business impact.

Bands measure your Agentic Execution Gap (100% minus your Realized Agentic Value). Most organizations today land in the upper two bands.
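
The banding can be written as a small lookup. This Python sketch assumes the gap is rounded to a whole percentage before banding, since the bands are stated in whole percents. The rounding choice and the function name `gap_band` are assumptions, not part of the published assessment.

```python
def gap_band(realized_value):
    """Map a Realized Agentic Value (0 to 1) to its Agentic Execution Gap band."""
    if not 0 <= realized_value <= 1:
        raise ValueError("realized_value must be between 0 and 1")
    # Assumption: round the gap to a whole percent before banding.
    gap_pct = round((1 - realized_value) * 100)
    if gap_pct <= 20:
        return "Operational"
    if gap_pct <= 45:
        return "Embedded"
    if gap_pct <= 70:
        return "Significant Agentic Execution Gap"
    return "Critical"

# A realized value of about 5% corresponds to a gap of about 95%.
print(gap_band(0.048))
```

A realized value of 0.048 falls in the Critical band, consistent with the observation that most organizations land in the upper two bands.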

How to Read Your Result

This executive version is a directional read, not a verdict — and the number matters less than the pattern. Your lowest-scoring layer is where agent value is leaking first and where leadership attention returns the most, because in a multiplicative system fixing the weakest link lifts the whole product. Run it with your leadership team for a single workflow before you run it across the portfolio.

Quotable

“Every AI strategy eventually becomes an execution problem. With agents, it becomes one faster — because the capability arrives long before the organization is ready to use it.”

Erik R. Miller — ERM Advisory

The Agentic Operator: What the Gap Means for Leaders

There is a deeper reason the Agentic Execution Gap matters now, and it is the same reason execution is becoming the defining leadership skill of the decade. AI is collapsing the cost of capability. Anyone can now access an extraordinarily capable agent; the model is no longer a source of advantage because it is no longer scarce. When capability becomes abundant, the scarce resource, and therefore the source of advantage, shifts to execution: the disciplined, coordinated, accountable work of turning capability into outcome. AI makes good operating models faster and makes bad operating models fail faster. It does not substitute for the operating model.

This is the connective tissue between this article and the larger body of work it belongs to. The same multiplicative logic governs the Revenue Execution Gap — the distance between strategic intent and realized business outcomes — and the Marketing Execution Gap, where great strategy dies in the seams between functions. Agents do not change that logic. They intensify it, because they widen the distance between what is possible and what an organization is actually built to do. For a concrete operating system that closes these gaps inside the go-to-market function, the AI Marketing Operating System shows what the integrated version looks like in practice.

The leaders who will win the agentic era are not the ones with the best models. Everyone will have excellent models. They are the ones who can connect strategy, workflow, oversight, measurement, and adoption into a single working system — who treat execution as the discipline it is. Call this leader the Agentic Operator: someone who understands the technology well enough to respect it and the organization well enough to change it, and who knows that the agent was never the hard part.

Closing the Agentic Execution Gap

The gap does not close by waiting for a better model. It closes by doing the unglamorous operational work the technology cannot do for you — one workflow, all five layers, proven and then expanded. Here is the executive action plan, distilled.

Executive Action Plan

Closing the Gap — A Leader's Checklist

  • Stop asking “is the model good enough?” and start asking “which layer is leaking?”
  • Choose one high-value workflow and take it all the way to operational before starting a second.
  • Capture the baseline before deployment — without it, you cannot prove value and cannot defend the budget.
  • Redesign the workflow around the agent; never bolt the agent onto an unchanged process.
  • Define decision rights, escalation, and a single accountable owner before you give the agent real authority.
  • Decide in advance what freed capacity is for, so time saved becomes value realized.
  • Run the 15-question assessment with your leadership team and fix your lowest layer first.
  • Treat agentic execution as an operating discipline you build, not a product you buy.
Further Reading & References
  1. MIT Project NANDA, “The GenAI Divide: State of AI in Business 2025,” reported by Fortune (August 2025) — the finding that roughly 95% of enterprise generative-AI pilots delivered no measurable P&L return, and that integration into real workflows, not model quality, separated the winners.
  2. McKinsey & Company, “The State of AI in 2025: Agents, innovation, and transformation,” McKinsey QuantumBlack (2025) — only about 6% of organizations report significant enterprise-wide EBIT impact from AI, and workflow redesign correlates most strongly with impact, yet only ~21% have redesigned any workflow.
  3. Gartner, “Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027,” Gartner Newsroom (June 2025) — escalating costs, unclear business value, and inadequate risk controls cited as the primary causes of cancellation.
  4. Deloitte, “State of Generative AI in the Enterprise,” Deloitte (2024) — regulation and risk identified as the fastest-rising barrier to scaling, with value measurement and governance among the factors separating organizations that scale from those that stall.
  5. PwC, “PwC AI Agent Survey,” PwC (May 2025) — 79% of companies report adopting AI agents, yet most say half or fewer of their employees interact with them in daily work, evidence that broad access rarely equals deep adoption.
  6. Harvard Business Review, ongoing coverage of AI adoption, workflow change, and the human factors in enterprise AI, HBR — AI and Machine Learning.
  7. Erik R. Miller, The Revenue Execution Gap and The Marketing Execution Gap (ERM Advisory, 2026) — the parent frameworks on which this article builds.
Erik R. Miller

Revenue operator and growth advisor. Nearly a decade improving commercial performance for enterprise organizations across New York, London, Mumbai, and Singapore — in financial services, capital markets, enterprise technology, and customer operations. Founder of ERM Advisory, where he helps leadership teams close the gap between capability and results. Subscribe to The Operator for more.

Frequently Asked Questions

What is the Agentic Execution Gap?

The Agentic Execution Gap is the gap between successfully deploying AI agents and successfully integrating them into business operations at scale. Most AI agent initiatives fail not because the technology is weak, but because organizations never solve the operational work of converting agent capability into business outcomes. The gap is not intelligence — it is execution.

Why do AI agent deployments fail?

They fail when capability never becomes adoption. Agents are built or bought, demonstrated successfully, then stall because they are not embedded in real workflows, the people around them do not trust them, leadership cannot measure their output, and the business never changes how it works. Failure happens in operations, not in the model.

How do organizations measure AI agent ROI?

By setting a pre-agent baseline for a specific workflow — time, cost, quality, or revenue — then measuring the same metric after the agent is embedded, net of the cost to build, integrate, govern, and maintain it. Without a baseline captured before deployment, agent ROI is unprovable and budgets get cut.

What is an AI agent operating model?

An AI agent operating model is the structure of roles, workflows, decision rights, oversight, and measurement that determines how AI agents and people work together to produce outcomes. It answers who owns the agent, what it may decide, when a human intervenes, and how its value is measured. It is the difference between a demo and a durable capability.

What is AI agent governance?

AI agent governance is the framework of policies, permissions, oversight, and accountability that determines what an agent may do, who is responsible for its actions, and how its behavior is monitored and corrected. Good governance does not slow agents down; it is what lets an organization trust them enough to give them real work.

What prevents AI agents from scaling?

Each layer of execution leaks value: unclear business mandate, shallow workflow integration, low human trust, invisible measurement, and no change in how the business operates. Because these losses multiply rather than add, a capable agent can still deliver almost no business impact. Scaling requires closing every layer, not improving the model.

What is the difference between AI adoption and AI execution?

AI adoption is whether people use an AI agent. AI execution is whether that use is integrated, governed, measured, and converted into a business outcome at scale. An organization can have high adoption and still have an Agentic Execution Gap, because usage that never changes how the business operates produces no durable value.
