{"content: "The most important unsolved problem in enterprise AI isn't whether AI can do a task — it's whether AI is doing the right task. A January report showed CLA's AI agent now does the work of 853 full-time employees and saved $60 million. But during the same earnings cycle, its CEO admitted publicly that the AI strategy had cost something far more valuable than $60 million — and he's still trying to buy it back. This isn't an AI is overhyped story. It's the opposite. The AI works too well. And the distinction between AI that fails and AI that succeeds at the wrong thing is the biggest problem in enterprise AI right now.
The Klarna Backstory
In early 2024, Klarna rolled out an AI-powered customer service agent. It handled 2.3 million conversations in its first month, across 23 markets and in 35 languages. Resolution times dropped from 11 minutes to two. The CEO projected $40 million in savings. Then customers started complaining.
Generic answers. Robotic tone. No ability to handle anything requiring judgment. By mid-2025, CEO Sebastian Siemiatkowski told Bloomberg that cost had been too predominant an evaluation factor, and the result was lower quality. Klarna began frantically rehiring the human agents it had let go.
Most people tell this story as proof that AI can't handle nuance, a comforting reading in early 2025. A more interesting reading in 2026 is that the AI agent was extraordinarily good at resolving tickets fast, and that was the wrong goal to give it. Klarna's organizational intent wasn't "resolve tickets fast." It was "build lasting customer relationships that drive lifetime value in a fiercely competitive fintech market."
Those are profoundly different goals requiring profoundly different decision-making at the point of interaction. A human agent with five years at the company knows this difference intuitively. She knows when to bend a policy, when to spend three extra minutes because the customer's tone says they're about to churn, when efficiency is the right move and when generosity is. She knows this because she absorbed Klarna's real values: not the ones on the website, but the ones encoded in the decisions managers make every day.
The AI agent knew none of it. It had a prompt. It had context. It did not have intent.
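To make the mismatch concrete, here is a minimal sketch, with hypothetical names and numbers rather than anything from Klarna's actual system, of how the same ticket resolves differently depending on which objective the agent is given:

```python
from dataclasses import dataclass

@dataclass
class Ticket:
    handle_cost: float     # minutes a careful, human-style resolution takes
    churn_risk: float      # 0..1, estimated from tone and account history
    lifetime_value: float  # projected revenue if the customer stays

def action_under(objective: str, t: Ticket) -> str:
    """Pick an action for the same ticket under two different objectives."""
    if objective == "resolve_fast":
        # Speed is the metric, so the canned reply always wins on handle time.
        return "send_canned_reply"
    if objective == "lifetime_value":
        # Generosity wins when the expected retained value exceeds its cost.
        if t.churn_risk * t.lifetime_value > t.handle_cost:
            return "escalate_to_human"
        return "send_canned_reply"
    raise ValueError(f"unknown objective: {objective}")

ticket = Ticket(handle_cost=3.0, churn_risk=0.6, lifetime_value=400.0)
print(action_under("resolve_fast", ticket))    # the metric the agent was given
print(action_under("lifetime_value", ticket))  # the metric the business needed
```

Same agent, same ticket, opposite decisions. The gap between those two calls is the entire distance between the prompt the agent had and the intent it lacked.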
The Three Disciplines
Naming things matters. Naming is how we create shared understanding, and the industry has been short on precise terminology.
Prompt engineering was the first discipline of the AI era. It was individual, synchronous, and session-based: you sat in front of the chat window, crafted an instruction, and iterated on the output. It's a personal skill, and the value is personal. This era produced thousands of "how to write the perfect prompt" blog posts, most of them terrible.
Context engineering followed prompt engineering. It's the one the industry is currently grappling with. Anthropic published a foundational piece in September 2025 that defined context engineering as the shift from crafting isolated instructions to crafting the entire information state an AI system operates within. Context engineering is where the action is right now. Building RAG pipelines, wiring up MCP servers, structuring organizational knowledge so agents can access it. It's necessary, but it's not sufficient.
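As a rough sketch of the shift Anthropic describes, from crafting an instruction to crafting an information state, context assembly might look like the following; `retrieve` and `fetch_tools` are hypothetical stand-ins for a real RAG pipeline and MCP-connected systems:

```python
def build_context(question: str, retrieve, fetch_tools) -> str:
    """Assemble the full information state an agent operates within,
    rather than a single hand-crafted instruction."""
    docs = retrieve(question, k=3)   # stand-in for a RAG pipeline
    tools = fetch_tools(question)    # stand-in for MCP server results
    sections = [
        "## Retrieved knowledge",
        *docs,
        "## Live system data",
        *tools,
        "## Task",
        question,
    ]
    return "\n".join(sections)

# Stubbed sources stand in for a real vector store and live connectors.
context = build_context(
    "Why did churn rise in Q3?",
    retrieve=lambda q, k: ["[doc] Q3 churn report excerpt"],
    fetch_tools=lambda q: ["[crm] 412 cancellations tagged 'support'"],
)
```

Note what this sketch contains and what it doesn't: knowledge and live data, but nothing about what the organization wants done with them. That missing layer is the subject of the third discipline.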
Intent engineering is the third discipline, and it's the one almost nobody is building for yet. Context engineering tells agents what to know; intent engineering tells them what to want. It's the practice of encoding organizational purpose into infrastructure, not as prose in a system prompt, but as structured, actionable parameters that shape how agents make decisions autonomously.
Without intent engineering, you get what Klarna got: a technically brilliant agent optimizing for exactly the wrong objective.
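What "structured, actionable parameters" could mean in practice is best shown with a minimal illustrative sketch; every name, weight, and threshold below is hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Intent:
    """Organizational purpose as structured parameters, not prose in a prompt."""
    objective: str                  # what the agent should optimize for
    weights: dict                   # how to trade competing outcomes off
    hard_limits: dict = field(default_factory=dict)  # lines never crossed

def score(outcomes: dict, intent: Intent) -> float:
    """Rank a candidate action by the organization's declared trade-offs."""
    for metric, floor in intent.hard_limits.items():
        if outcomes.get(metric, 0.0) < floor:
            return float("-inf")    # a guardrail breach disqualifies outright
    return sum(w * outcomes.get(m, 0.0) for m, w in intent.weights.items())

intent = Intent(
    objective="build lasting customer relationships",
    weights={"retention": 0.7, "speed": 0.3},
    hard_limits={"satisfaction": 0.5},
)
fast    = {"speed": 0.9, "retention": 0.2, "satisfaction": 0.4}
careful = {"speed": 0.4, "retention": 0.8, "satisfaction": 0.8}
best = max([fast, careful], key=lambda a: score(a, intent))
```

The point of the structure is that the trade-off is declared once, centrally, and versioned, rather than rediscovered (or missed) in every prompt an individual team writes.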
The Investment Paradox
Deloitte's 2026 State of AI in the Enterprise report, drawing on 3,000 leaders in 24 countries, found that 84% of companies have not redesigned jobs around AI capabilities and only 21% have a mature model for agent governance. These numbers aren't a technology story. They're an intent failure.
The models work. The context pipelines are getting better. What's missing is the organizational infrastructure that connects AI capability to organizational purpose.
Investment in AI continues to be massive and accelerating. Deloitte's tech value survey found 57% of respondents putting between 21% and 50% of their digital transformation budgets into AI automation, and 20% of companies investing over half: on average, $700 million for a company with $13 billion in revenue. KPMG's Q4 AI pulse survey showed capital flowing, ROI confidence rising, and agents moving from pilots to professionalized platforms. Gartner predicts that by 2028, 15% of day-to-day work decisions will be made autonomously by agents.
But the results are decidedly mixed. 74% of companies globally report they have yet to see tangible value from AI, and McKinsey found 30% of AI pilots failed to achieve scaled impact.
These numbers coexist with the investment numbers, and there's no real contradiction once you read them carefully. Organizations have solved "can AI do this task?" at the individual-task level, and they have completely failed to solve "can AI do this task in a way that serves organizational goals, at scale, with appropriate judgment?" That second question is an intent engineering question.
The Copilot Problem
Look at what happened with Microsoft Copilot, one of the most heavily invested enterprise AI products in history. Microsoft poured billions into infrastructure, embedded AI into every Office application, and launched an aggressive enterprise sales campaign. 85% of Fortune 500 companies adopted it, and then adoption stalled hard. Gartner found only 5% of organizations moved from a pilot to a larger-scale deployment. Only about 3% of the total Microsoft 365 user base adopted Copilot as paid users. Bloomberg reported Microsoft slashing internal sales targets after a majority of salespeople missed their goals.
The standard explanation centers on UX problems and model quality — real issues. But they're not the fundamental issue. The fundamental issue is that deploying an AI tool across an organization without organizational intent alignment is like hiring 40,000 new employees and never telling them what the company does, what it values, or how to make decisions.
You get lots of activity and not much productivity. You get AI usage metrics in a dashboard and almost no measurable impact on what the organization is trying to accomplish. That's not a tools problem. It's an intent gap.
The Three Layers
The intent gap operates across three distinct layers at different altitudes:
Layer one: Unified context infrastructure. Every team building agents rolls its own context stack. One team pipes Slack data through a custom RAG pipeline. Another manually exports Google Docs into a vector store. A third has built an MCP server that connects to Salesforce but not to Jira. A fourth team doesn't know the other three exist.
This is what analysts call the shadow agents problem, and it mirrors the shadow IT crisis of the early cloud era — except the stakes are much higher because agents don't just access data, they act upon it.
The Model Context Protocol, which Anthropic introduced in late 2024 and donated to the Linux Foundation in December 2025, is the most promising attempt at standardization. MCP has seen rapid adoption: OpenAI, Google, Microsoft, and more than 50 enterprise partners have committed, and monthly SDK downloads now approach 100 million. But protocol adoption and organizational implementation are very different things.
Having a USB-C standard does not help if your company hasn't decided which ports to install, who maintains them, or what gets plugged in. The context infrastructure question is not really a technical question. It is architectural and political. Which systems become agent-accessible? Who decides what context an agent can see across departments?
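One illustrative shape for that layer, sketched under the assumption of a simple role-based access policy (all names and data here are hypothetical), is a single sanctioned registry instead of four parallel stacks:

```python
class ContextRegistry:
    """One sanctioned gateway to organizational context, instead of each
    team wiring its own private pipeline to Slack, Docs, or Salesforce."""

    def __init__(self):
        self._sources = {}  # source name -> (fetch_fn, allowed agent roles)

    def register(self, name, fetch_fn, allowed_roles):
        """A central decision: which system is exposed, and to whom."""
        self._sources[name] = (fetch_fn, set(allowed_roles))

    def fetch(self, name, agent_role):
        """Agents get context only through the declared policy."""
        fetch_fn, allowed = self._sources[name]
        if agent_role not in allowed:
            raise PermissionError(f"{agent_role} may not read {name}")
        return fetch_fn()

registry = ContextRegistry()
registry.register(
    "crm",
    lambda: ["acct 114: renewal at risk"],  # stand-in for a live connector
    allowed_roles={"support", "sales"},
)

records = registry.fetch("crm", "support")   # sanctioned access succeeds
# registry.fetch("crm", "marketing")         # raises PermissionError
```

The code is trivial on purpose: the hard part is not the registry, it's getting the organization to agree on what goes in the `register` calls.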
Layer two: Coherent AI worker toolkit. Everyone's rolling out their own AI workflow. One person uses Claude for research and ChatGPT for drafting. Another uses Cursor for code and Perplexity for fact-checking. A third has built a custom agent chain using LangGraph. A fourth is copy-pasting into a chat window.
None of these employees can articulate their workflow in a way that's transferable, measurable, or improvable by anybody else. The difference between individual AI use and organizational AI leverage is enormous — it's the difference between having one good hire and having a system that makes everybody better.
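A minimal sketch of what "transferable" could mean in practice: a workflow captured as data (the tool names below are hypothetical) can be published, measured, and improved by colleagues, unlike one living only in an individual's habits:

```python
import json

# A workflow expressed as data rather than as one person's muscle memory.
workflow = {
    "name": "competitive-brief",
    "steps": [
        {"tool": "research_assistant", "task": "gather sources", "output": "notes"},
        {"tool": "drafting_model", "task": "write first draft", "input": "notes"},
        {"tool": "fact_checker", "task": "verify claims", "input": "draft"},
    ],
}

shared = json.dumps(workflow, indent=2)  # publishable to a team catalog
restored = json.loads(shared)            # any colleague can run or refine it
```

Once the workflow is an artifact rather than a habit, the organization can version it, attach metrics to each step, and compare variants, which is exactly what individual chat-window usage never yields.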
Lloyd's 2026 report found workforce access to sanctioned AI tools expanded by 50% in a year. But that doesn't mean access is sufficient. Organizations are often giving people tools without giving them or their agents the organizational context and data that allow those tools to deliver real value.
Critics might note that focusing on intent engineering risks underestimating the very real technical challenges — model reliability, output consistency, security vulnerabilities — that still plague enterprise AI deployments regardless of alignment. The intent gap is real, but it's not the only gap.
Bottom Line
The strongest part of this argument is its core insight: AI succeeds at metrics that may not matter. Technical excellence doesn't equal strategic value. The biggest vulnerability is structural: even naming the problem correctly doesn't automatically solve it. Organizations need to build the infrastructure connecting AI capability to organizational purpose — and most haven't started.