
Cleared for takeoff? Why AI agents fail before they leave the runway

By Kirsty Biddiscombe, EMEA Business Lead AI, ML & Data Analytics

Over the past year, AI agents have been positioned as autonomous operators capable of planning, reasoning and executing complex tasks with limited supervision. For many organisations, they represent the next phase of enterprise automation. They are systems that can act, rather than simply respond. 

Unfortunately, in practice, many agents never truly get airborne. Research suggests that the majority of agents (65%) fail to complete multi-step tasks reliably. And in many cases, the issue is not a lack of horsepower but the runway conditions. Many organisations need to lay the groundwork first. Without engineered infrastructure, defined operating procedures and disciplined data management, the very agents designed to simplify and streamline operations end up complicating them.

Prepping for flight  

A common deployment mistake is assuming that once an agent is configured, it can simply be assigned a task and expected to perform. In reality, most enterprise data environments are poorly suited to AI agents as they stand. Processes are undocumented, data ownership is unclear, decision logic may be uncodified, and hand-offs between teams are often inconsistent. Humans fill these gaps with experience and contextual judgement. Agents cannot.

When workflows are loosely defined, the resulting ambiguity is where agents start to make mistakes. In a finance environment, for example, an agent tasked with processing vendor invoices may need to validate purchase orders, check budget thresholds, escalate exceptions and update ERP records. If those approval rules are undocumented or inconsistently applied across business units, the agent cannot infer them. As a result, it will either reject valid invoices or approve ones that should have been escalated.
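What "codified decision logic" looks like in practice can be sketched in a few lines. The purchase-order register, the budget threshold and the action strings below are illustrative assumptions, not a real approval policy; the point is that every rule the agent needs is written down rather than left to inference.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Invoice:
    vendor: str
    amount: float
    po_number: Optional[str]

# Hypothetical, explicitly codified rules -- the register and the
# threshold here are placeholders for an organisation's own policy.
PO_REGISTER = {"PO-1001", "PO-1002"}
BUDGET_THRESHOLD = 10_000.00

def route_invoice(inv: Invoice) -> str:
    """Return an unambiguous action the agent can execute without guessing."""
    if inv.po_number is None or inv.po_number not in PO_REGISTER:
        return "escalate: no matching purchase order"
    if inv.amount > BUDGET_THRESHOLD:
        return "escalate: exceeds budget threshold"
    return "approve: update ERP record"
```

Once the rules exist in this form, agent behaviour becomes testable: the same invoice always routes the same way, across every business unit.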

This is why, before an agent is deployed, organisations must map workflows end-to-end, defining decision thresholds and parameters, clarifying success metrics and establishing clear transition points between human oversight and automated execution. Failure modes must also be anticipated in advance. Orchestration layers, fallback logic and intervention triggers should be designed as part of the initial architecture. An agent cannot operate reliably inside an undocumented system.
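Fallback logic and an intervention trigger can be as simple as a wrapper around each agent step. This is a minimal sketch, assuming a hypothetical step that raises an exception when it hits ambiguity it cannot resolve; the retry count and notification channel are placeholders for real architecture decisions.

```python
def run_with_fallback(step, payload, max_retries=2, notify_human=print):
    """Try an agent step; after repeated failure, trigger human intervention.

    `step` is any callable representing one unit of agent work. On success
    its result is returned; after exhausting retries, a human is notified
    and None is returned so the orchestrator can park the task.
    """
    last_error = None
    for _ in range(1 + max_retries):
        try:
            return step(payload)
        except ValueError as exc:  # the agent signals unresolvable ambiguity
            last_error = exc
    notify_human(f"intervention required: {payload!r} ({last_error})")
    return None
```

The design choice worth noting is that the handover point to a human is decided in the architecture, not improvised by the agent at runtime.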

With these foundations in place, we can turn our attention to the other major barrier for AI agents: the data. 

More pilots might not make a better flight  

When agents underperform, there is an understandable instinct to widen their access to data: more documents, more logs, more dashboards, on the assumption that broader visibility will correct errors.

In practice, excessive and uncurated data introduces noise into the system and may compound agent underperformance. AI agents require clarity; redundant files, outdated documentation and low-confidence sources obscure it. The result is inconsistent outputs, hallucinations or flawed task execution.

Pilots are not presented with the weather conditions covering the planet for context before they take off. Instead, they are given the instruments required for the specific flight. The same principle applies here. Effective agent deployment requires disciplined data curation. To be completely clear, an AI agent for HR queries doesn’t need to be fed marketing campaign drafts. Similarly, a procurement agent should not be trained on employee performance reviews. 

Organisations must be highly selective with the data they train AI agents on. In turn, that requires identifying authoritative sources, cleaning them up, and then maintaining them over time. At the end of the day, it's precision, not abundance, that improves performance.
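A curation pass of this kind reduces, in essence, to filtering a corpus against an allowlist of authoritative sources and a freshness window. The source names and cutoff date below are illustrative assumptions for the HR-agent example above, not a recommended configuration.

```python
from datetime import date

# Hypothetical allowlist: only these sources feed the HR agent.
AUTHORITATIVE = {"hr_policy_portal", "benefits_handbook"}

def curate(docs, cutoff=date(2024, 1, 1)):
    """Keep only documents from authoritative sources updated after `cutoff`.

    Each doc is a dict with at least 'source' and 'updated' keys; anything
    else (marketing drafts, stale policies) is dropped before indexing.
    """
    return [
        d for d in docs
        if d["source"] in AUTHORITATIVE and d["updated"] >= cutoff
    ]
```

The allowlist is the important part: it makes "what this agent is allowed to know" an explicit, reviewable artefact rather than an accident of whatever was in the shared drive.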

Don’t forget the control tower for trust and oversight  

Even when workflows are defined and inputs are curated, a final barrier remains: confidence. Leaders may hesitate to rely on agentic systems when they cannot see how decisions are made or trace the origin of the underlying data.  

Such opacity restricts adoption. Without visibility into data lineage, lifecycle management and decision pathways, automated systems remain difficult to govern. This is especially important when deploying agents in highly regulated industries, like healthcare or financial services. The inability to reconstruct how an agent reached a decision, including which documents were retrieved and which tools were invoked, creates immediate audit and compliance challenges.  

Addressing this makes robust data management and governance frameworks non-negotiable. End-to-end traceability turns automated outputs from opaque results into auditable processes. In aviation, air traffic control provides coordination, visibility and intervention authority. It does not fly the aircraft, but it ensures safe operation within a controlled system. Similarly, governance structures do not replace agent autonomy, but they do make it viable at scale.
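The traceability described above, recording which documents were retrieved and which tools were invoked, can be sketched as a simple append-only event log. This is an illustrative structure, not a specific product's API; real deployments would add identity, retention and tamper-evidence on top.

```python
import json
import time

class AuditTrail:
    """Append-only record of an agent's retrievals and tool calls,
    so a decision can be reconstructed after the fact."""

    def __init__(self):
        self.events = []

    def record(self, kind, **details):
        # Each event is timestamped and typed, e.g. "retrieval" or "tool_call".
        self.events.append({"ts": time.time(), "kind": kind, **details})

    def export(self):
        # Serialise the trail for auditors or compliance review.
        return json.dumps(self.events, indent=2)
```

In use, the agent runtime would call `trail.record("retrieval", doc_id=...)` before answering and `trail.record("tool_call", tool=...)` around every action, giving auditors the reconstruction path regulators expect.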

Towards airworthiness 

AI agents do not fail because they lack sophistication. They fail because they are introduced into environments that aren’t naturally prepared for autonomous operation. Organisations that treat agents as self-contained tools will continue to experience stalled takeoffs. Those that treat deployment as the final step in a broader systems engineering exercise will see sustained performance.  

Ultimately, autonomy is not a feature you switch on. It is a capability you engineer. When the runway is level, the instruments calibrated and the control tower in place, AI agents can finally do what they were designed to do: take off properly and operate with confidence.
