AI agents are entering the workplace at a rapid rate, and it’s not hard to see why. They’re a dream come true for employees who can now offload tasks and streamline decision making across enterprise environments. In theory, they offer boundless opportunity – they’re always on, they’re hyper-efficient, and they never grow tired.
However, this efficiency without judgment comes at a cost.
Like the Cybermen of Doctor Who, AI agents are emotionless and operate through cold logic. They follow instructions without awareness or ethical judgment. The Cybermen weren’t evil by design; they simply did what they were programmed to do and didn’t question orders.
In the same way, today’s AI agents don’t consider ethics, context, or intent. They simply execute the instructions they are given, regardless of who issues them. Yet many organisations deploy these agents as if they possess human-like judgement, assuming they will operate responsibly within traditional frameworks. This assumption is risky. Without built-in safeguards or oversight, AI agents can carry out flawed or harmful directives with unwavering precision.
Unlike employees, AI agents can’t recognise deception or sense when something feels off. Without the ability to question instructions or flag anomalies, they become a new and vulnerable attack surface.
The Cybermen serve as a cautionary tale of what happens when human judgment is replaced with mechanised efficiency. In cybersecurity, the same risk applies. If AI agents aren’t governed by tightly defined communication policies and behaviour boundaries, they may execute actions that are permitted by their programming but catastrophic in consequence.
These systems don’t need human-like controls. They need safeguards designed specifically for machines; safeguards that are rigid, rule-based, and resistant to misuse.
How AI agents break the identity-based security model
Modern security frameworks are built around identity: they verify users, assign permissions, and flag anomalous behaviour, and they rely on human judgment to question and validate commands or requests. However, AI agents are designed for specific tasks and lack this human-like scrutiny. When their logic is compromised, they execute commands without question, because they are not programmed to pause and consider whether a different course of action is warranted.
Once compromised, an agent with privileged access can move laterally through the environment by doing exactly what it’s programmed to do. As agents are designed to operate across multiple platforms and environments, the blast radius of a compromise can be significant.
This is particularly concerning in modern hybrid and multi-cloud environments where thousands of assets are interconnected, and visibility is often limited. A single compromised agent can silently execute a chain of actions that would take a human attacker weeks or even months to orchestrate, often without triggering a single alarm.
Because of this, containment cannot be treated as a nice-to-have; it must be a baseline requirement. Alongside ensuring AI agents are designed with safeguards in mind, we need to rethink the security architecture around them and enforce boundaries that are purpose-built for machine behaviour.
The rapid deployment of AI agents presents a unique challenge: vast numbers of them can be performing specific functions across the network without clear visibility. As organisations adopt more and more agents, that lack of visibility compounds the risk. As a result, each agent on the network must be aligned with the organisation’s risk appetite.
In many cases, these agents are developed in silos within organisations, meaning their interdependencies are obscured and monitoring is difficult. This fragmentation heightens the risk that vulnerabilities may go unnoticed until they have broader consequences.
For cybercriminals, tampering with AI agents is like adding food dye to a glass of water: the colour changes quickly and the impact is noticeable almost immediately. Adding the same dye to a lake would have almost no visible effect. Poisoning an agentic AI system is therefore far more rewarding for an attacker than trying to corrupt a large model like ChatGPT, where scale and complexity dilute any manipulation.
Organisations must consider not just isolated risks, but also the potential for threats to spread widely across interconnected environments. These potential impacts should be monitored through key risk indicators (KRIs), which provide a systematic approach to risk: emerging threats can be categorised and proactive measures taken before problems arise.
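To make the idea concrete, here is a minimal sketch of an agent-focused KRI check. The metric names and thresholds are hypothetical assumptions for illustration, not drawn from any particular framework or standard.

```python
# Hypothetical key risk indicators (KRIs) for AI agents.
# Metric names and thresholds are illustrative assumptions, not a standard.
from dataclasses import dataclass

@dataclass
class KRI:
    name: str
    value: float      # current observed value for this agent
    threshold: float  # level at which the risk needs proactive attention

def breached_kris(kris: list[KRI]) -> list[str]:
    """Return the names of indicators that have crossed their threshold."""
    return [k.name for k in kris if k.value >= k.threshold]

agent_kris = [
    KRI("unapproved_system_calls_per_day", value=3, threshold=1),
    KRI("privileged_actions_outside_change_window", value=0, threshold=1),
    KRI("new_outbound_connections_this_week", value=5, threshold=2),
]

for name in breached_kris(agent_kris):
    print(f"KRI breached: {name} - review the agent before the risk spreads")
```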
Establishing secure boundaries requires a clear understanding of how agents interact with systems and data. That’s where graph thinking becomes essential. It helps map relationships, dependencies, and potential vulnerabilities across the enterprise landscape.
Why security graphs matter in modern environments
Many traditional monitoring tools struggle to keep pace with the complexity of modern environments. They tend to track events in isolation, missing the subtle relationships between systems. This often results in alerts that lack meaningful context and are, therefore, hard to act on. In contrast, a graph-based approach maps the full picture. Rather than showing isolated events, it reveals how events connect, mapping out the paths an attacker might take and exposing the relationships that traditional tools overlook.
When applied to AI agents, a graph reveals what agents are doing, what they are connected to, and whether those connections make sense. This kind of visibility is essential for detecting lateral movement and understanding the true scope of any suspicious behaviour. Security graphs give teams the visibility to see how a failure in one area could cascade into others.
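As a simple sketch of that cascade, the graph below models agent-to-system access and walks it to estimate the blast radius of a single compromised agent. The agent and system names are invented purely for illustration.

```python
# Minimal security graph sketch: nodes are agents and systems,
# an edge means "can reach / has access to". Names are illustrative only.
from collections import deque

access_graph = {
    "invoice-agent": ["erp", "email-gateway"],
    "erp":           ["finance-db"],
    "email-gateway": [],
    "finance-db":    ["backup-store"],
    "backup-store":  [],
    "hr-agent":      ["hris"],
    "hris":          [],
}

def blast_radius(graph: dict[str, list[str]], start: str) -> set[str]:
    """Everything transitively reachable from a compromised node."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

# A compromise of the invoice agent quietly exposes downstream systems
# that an event-by-event view would never connect.
print(sorted(blast_radius(access_graph, "invoice-agent")))
# ['backup-store', 'email-gateway', 'erp', 'finance-db']
```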
More importantly, graph-based security provides the context needed to move from reactive defence to proactive control. By understanding relationships, it becomes much easier to distinguish actions that are expected and legitimate from those that represent genuine threats. Graphs are vital not just for observing behaviour, but for understanding intention, scope, and consequence.
This is critical in environments where the majority of attacks don’t come in through the front door but spread quietly from within. With graph thinking, security teams can respond to threats with precision and speed.
Segmentation must be the architecture of resilience
Once visibility is achieved, the next step is restriction. Without defined limits, even the most well-meaning AI agent can cause damage when misused.
Resilience must be built into the planning and rollout of every agent. The best method for this is segmentation, which allows organisations to place hard limits on an AI agent’s influence.
With segmentation in place, each agent is confined to a narrow operational zone. It can only communicate with approved systems, access the data it is supposed to, and perform tasks explicitly defined as safe. If it is compromised, its ability to do harm is contained to a minimal footprint.
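As a sketch of the idea (the policy format, agent names, and actions are assumptions, not a specific product feature), a segment can be expressed as an allowlist that is checked before any agent action is carried out:

```python
# Illustrative segmentation policy: each agent is confined to an explicit
# allowlist of systems and actions. Names and structure are assumptions.
SEGMENT_POLICY = {
    "invoice-agent": {
        "systems": {"erp", "email-gateway"},
        "actions": {"read_invoice", "send_reminder"},
    },
}

class PolicyViolation(Exception):
    pass

def authorise(agent: str, system: str, action: str) -> None:
    """Deny by default: anything not explicitly allowed is refused."""
    policy = SEGMENT_POLICY.get(agent)
    if policy is None:
        raise PolicyViolation(f"{agent} has no segment defined")
    if system not in policy["systems"]:
        raise PolicyViolation(f"{agent} may not touch {system}")
    if action not in policy["actions"]:
        raise PolicyViolation(f"{agent} may not perform {action}")

authorise("invoice-agent", "erp", "read_invoice")           # allowed
# authorise("invoice-agent", "finance-db", "export_data")   # raises PolicyViolation
```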
Segmentation has to be seen as an enabler of real-time control when paired with contextual visibility from graph-based models. It adapts to changes, isolates risk proactively and provides a clear line of defence even when threats are evolving faster than they can be catalogued.
Designing for compromise, not hoping to avoid it
The most dangerous assumption an organisation can make is that AI agents will develop good judgment. They won’t. They don’t understand ethics, risk, or nuance. Instead, they act according to their programming. If an instruction is flawed but still valid, they will follow it. If an attacker finds a way in, the agent will cooperate.
That’s why AI systems must be built with failure in mind. This means implementing safe defaults and input validation, and enforcing restrictive permissions and escalation protocols. Cautious behaviour must be prioritised over blind compliance with instructions.
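A minimal sketch of that mindset, assuming a hypothetical agent that executes named tasks, might wrap every instruction in validation, a refuse-by-default path, and an explicit escalation step:

```python
# Sketch of "designed for failure" handling: validate input, default to
# refusal, and escalate anything unusual to a human. All names are hypothetical.
import re

APPROVED_TASKS = {"generate_report", "summarise_ticket"}
SAFE_ARG = re.compile(r"^[\w\-]{1,64}$")   # conservative input validation

def handle_instruction(task: str, arg: str) -> str:
    if task not in APPROVED_TASKS:
        return escalate(f"unknown task '{task}'")        # safe default: refuse
    if not SAFE_ARG.match(arg):
        return escalate(f"suspicious argument '{arg}'")  # don't trust the input
    return run_task(task, arg)

def run_task(task: str, arg: str) -> str:
    return f"executed {task}({arg}) with least-privilege credentials"

def escalate(reason: str) -> str:
    # In practice this would page a human or open a ticket; here it just refuses.
    return f"refused and escalated: {reason}"

print(handle_instruction("generate_report", "Q3-sales"))
print(handle_instruction("delete_all_backups", "now"))
```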
Safeguarding an AI agent is only part of the equation. The infrastructure surrounding it must also assume that compromise is inevitable. Systems should be architected for containment, with boundaries that prevent escalation and observability that ensures threats don’t remain hidden.
The goal is not to make compromise impossible but to make it ineffective.
When attackers know they can’t move beyond the first point of compromise, the economics of an attack change dramatically. They lose the advantages of speed, reach, and stealth, and defenders regain the ability to act deliberately rather than reactively.
Security must think like an attacker and plan like a defender
AI agents are clearly changing enterprise operations by bringing speed, precision, and scale. Yet they are also introducing new risks that existing security models were never built to handle. Unlike people, these systems don’t question intent or pause to consider consequences.
In this new reality, relying on identity, trust, and intent-based logic is no longer enough. Defending modern environments means recognising that AI agents, once compromised, can act swiftly and silently across interconnected systems.
The future won’t be secured by hoping agents make the right decisions. It will be secured by designing systems where they’re only able to make the safe ones.