Agentic AI has gone from research labs to the trading floor in under three years. J.P. Morgan and CrowdStrike have both unveiled agentic AI solutions that solve real-world business problems and interface directly with clients. OpenAI has just released its ‘AI Agent’ to great hype, demonstrating that the technology’s capabilities are ever increasing.
These early wins prove the commercial appetite, but this nascent technology is still largely considered too risky to introduce into Critical National Infrastructure (CNI) and highly regulated environments, where the blast radius of something going wrong is simply too large. Until we build CNI-grade secure development practices around data sovereignty, safe autonomy, model robustness and real-time assurance, large-scale adoption of this technology may still be some way off.
The Question: What Will it Take?
That is a question which has long interested me in my AI research. In this article, I outline a blueprint covering the key areas of concern we will need to address before CNI can leverage autonomous AI.
This is based on several pieces of prior research such as the Agentic Red Team Guide, the National Cyber Security Centre (NCSC)’s Secure AI Development Practices, sector-specific codes of conduct for AI development, and my own experience delivering security testing in production CNI environments over the last six years.
Data Privacy is Non-Negotiable
Some of the very first real-world use cases for agentic AI were in sales, social media and marketing. These environments were an ideal test bed for an unproven technology, in part because they carry far fewer of the data privacy concerns that limit adoption in corporate and critical environments.
In the latter, the sensitive nature of the data being handled (think classified Government information) means that data privacy is non-negotiable. As such, AI adoption, which typically includes sending data off to AI providers, is a tricky subject. Fortunately for us, this is not an agentic-only problem but one which has existed for several years in the generative AI era too.
As such, we do have some idea about what is working here. I’ll discuss the two main approaches that I’ve seen leveraged in critical environments:
- The first is to use open-source models hosted locally. This demands technical know-how to implement and large amounts of compute (even for inference), and it comes with a performance gap versus frontier models. However, if complete data sovereignty is required, this is your go-to.
- The second approach leverages existing data boundaries. For example, Azure OpenAI allows organisations to use the latest OpenAI models whilst keeping their data within Microsoft’s control rather than sharing it with the AI provider (OpenAI in this case). Most CNI clients already trust cloud providers with their data, and the NCSC actively encourages this (a minimal sketch of this approach follows below).
This allows clients to maintain a data boundary they have already risk-accepted whilst getting access to the latest and greatest models.
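As a minimal sketch of that second approach, the snippet below calls a model through an Azure OpenAI deployment using the official `openai` Python SDK, so prompts and completions stay inside an already risk-accepted cloud boundary. The endpoint, deployment name and API version are placeholders; the exact configuration will depend on your tenancy and data-handling agreements.

```python
# Minimal sketch: calling a model via an Azure OpenAI deployment so that data
# remains within the organisation's existing cloud boundary.
# Endpoint, deployment name and API version below are placeholders.
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # e.g. https://<your-resource>.openai.azure.com
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",  # pin to the version your tenancy supports
)

response = client.chat.completions.create(
    model="gpt-4o-cni-deployment",  # the *deployment* name you created, not the raw model name
    messages=[
        {"role": "system", "content": "You are an assistant operating inside a restricted environment."},
        {"role": "user", "content": "Summarise today's maintenance logs."},
    ],
)

print(response.choices[0].message.content)
```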
Tame Autonomy
Let’s start with a simple rule: irreversible or safety-critical actions should require human-in-the-loop confirmation, at least to start. This approach, layered with additional security controls, limits the potential for something going badly wrong in these critical environments.
Additionally, we must develop strict guardrails around AI agents: tight restrictions on which tools they have access to, least-privilege identities, kill switches and safe failovers are all going to be required here.
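To make that concrete, here is a hedged sketch of what such guardrails might look like in code: a tool allowlist, a kill switch and a human-in-the-loop gate on irreversible actions. The class, tool names and approval flow are illustrative assumptions, not taken from any particular agent framework.

```python
# Illustrative guardrail layer around an agent's tool calls (all names are hypothetical).
IRREVERSIBLE_ACTIONS = {"open_valve", "shutdown_pump", "delete_records"}
TOOL_ALLOWLIST = {"read_sensor", "query_logs", "open_valve"}  # least privilege: only what this agent needs


class KillSwitchEngaged(Exception):
    """Raised when an operator has halted all autonomous activity."""


class GuardedToolRunner:
    def __init__(self):
        self.kill_switch = False  # flipped by an operator to stop the agent immediately

    def run(self, tool_name: str, args: dict):
        if self.kill_switch:
            raise KillSwitchEngaged("Agent halted by operator.")
        if tool_name not in TOOL_ALLOWLIST:
            raise PermissionError(f"Tool '{tool_name}' is not in this agent's allowlist.")
        if tool_name in IRREVERSIBLE_ACTIONS and not self._human_approves(tool_name, args):
            return {"status": "rejected", "reason": "human reviewer declined"}
        return self._execute(tool_name, args)

    def _human_approves(self, tool_name: str, args: dict) -> bool:
        # Human-in-the-loop confirmation; in production this would route to an operator console.
        answer = input(f"Agent requests '{tool_name}' with {args}. Approve? [y/N] ")
        return answer.strip().lower() == "y"

    def _execute(self, tool_name: str, args: dict):
        # Dispatch to the real tool implementation (omitted in this sketch).
        return {"status": "executed", "tool": tool_name}


runner = GuardedToolRunner()
print(runner.run("read_sensor", {"sensor": "pump-3"}))  # allowed, no confirmation needed
```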
Going deeper, agents should have utility-based decision engines that score actions against a few defined criteria, rather than the superficial ‘maximise X’ approach of goal-based agents, which have a spotty history of ‘reward hacking’.
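A hedged sketch of that idea: score each candidate action against several weighted criteria (safety, reversibility, expected benefit) rather than a single goal, and refuse anything that falls below a threshold. The criteria, weights and threshold here are illustrative assumptions.

```python
# Illustrative utility-based action scoring; criteria and weights are assumptions, not a standard.
CRITERIA_WEIGHTS = {
    "safety": 0.5,         # how unlikely the action is to cause physical or data harm
    "reversibility": 0.3,  # how easily the action can be undone
    "benefit": 0.2,        # expected contribution to the task at hand
}
MINIMUM_UTILITY = 0.6


def utility(scores: dict) -> float:
    """Weighted sum of per-criterion scores, each expected in [0, 1]."""
    return sum(CRITERIA_WEIGHTS[name] * scores[name] for name in CRITERIA_WEIGHTS)


def choose_action(candidates: dict) -> str | None:
    """Pick the highest-utility candidate, or None if nothing clears the bar."""
    scored = {name: utility(scores) for name, scores in candidates.items()}
    best = max(scored, key=scored.get)
    return best if scored[best] >= MINIMUM_UTILITY else None


# Example: a risky, hard-to-reverse action loses out to a safe, reversible one.
candidates = {
    "restart_sensor_service": {"safety": 0.9, "reversibility": 0.9, "benefit": 0.6},
    "flush_control_database": {"safety": 0.4, "reversibility": 0.1, "benefit": 0.9},
}
print(choose_action(candidates))  # -> "restart_sensor_service"
```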
None of this removes risk, but it does limit the blast radius: if an agent hallucinates or goes rogue, the damage should be contained rather than, for example, costing hours of plant downtime.
Prepare for the Inevitable: Edge Cases
Assume edge cases are inevitable and prepare accordingly. We operate in a world where even frontier LLMs can be tricked with simple prompts and a small amount of concentrated effort.
However strong the guardrails, sooner or later an agent will behave in an unexpected way, whether that be due to a bad actor or simply an unforeseen operating environment. Agents should therefore feed detailed logs into the existing security stack, where detection rules can flag suspicious tool use, privilege changes or anomalous behaviour.
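As an illustration of the kind of telemetry this implies, the sketch below emits one structured JSON log line per tool call and applies a toy detection rule for tools outside the agent’s expected set. The field names and the rule are assumptions; in practice these events would be shipped to your existing SIEM and matched against proper detection content.

```python
# Illustrative structured logging for agent tool calls, plus a toy detection rule.
# Field names and the rule are assumptions; real events would feed an existing SIEM.
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("agent_audit")

EXPECTED_TOOLS = {"read_sensor", "query_logs"}


def log_tool_call(agent_id: str, tool: str, args: dict, outcome: str) -> None:
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "tool": tool,
        "args": args,
        "outcome": outcome,
        "suspicious": tool not in EXPECTED_TOOLS,  # toy rule: unexpected tool use gets flagged
    }
    logger.info(json.dumps(event))


# Example: the second call would be flagged for analyst review downstream.
log_tool_call("agent-07", "read_sensor", {"sensor": "pump-3"}, "ok")
log_tool_call("agent-07", "modify_firewall_rule", {"rule": "allow-any"}, "blocked")
```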
Incident response playbooks should map out high-impact scenarios so responders can act quickly and reduce potential impacts. As always, regular adversarial testing (both automated and human) is vital to keep these playbooks honest.
Regulation is Evolving
Threaded through the entire topic of AI adoption is an emerging body of regulation, which is especially prominent in the world of critical and regulated environments.
Whilst there are limited mandatory controls in the UK at the time of writing, this is quickly changing. The EU AI Act, NIS 2 and the ISO 42001 standard all ask broadly the same questions:
- Do you know where your models came from?
- Can you explain your agent’s decisions?
- Can you prove the controls actually work?
Organisations that bake these artefacts into their pipelines now will stay ahead of the curve in these highly regulated environments.
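A hedged sketch of what ‘baking these artefacts into the pipeline’ could mean in practice: a small record capturing model provenance, a pointer to the decision trace and control-test evidence, written out alongside each release. The schema, paths and field names are illustrative assumptions, not drawn from the regulations themselves.

```python
# Illustrative compliance artefact generated per release; the schema is an assumption,
# not taken from the EU AI Act, NIS 2 or ISO 42001 themselves.
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: str) -> str:
    p = Path(path)
    return hashlib.sha256(p.read_bytes()).hexdigest() if p.exists() else "unknown (file not present in this sketch)"


def build_assurance_record(model_path: str, decision_log: str, control_tests: dict) -> dict:
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        # "Do you know where your models came from?"
        "model_provenance": {
            "source": "internal-registry/agent-model",  # placeholder source
            "sha256": sha256_of(model_path),
        },
        # "Can you explain your agent's decisions?"
        "decision_trace": decision_log,  # pointer to the full audit log
        # "Can you prove the controls actually work?"
        "control_tests": control_tests,
    }


record = build_assurance_record(
    model_path="models/agent-model.bin",  # placeholder path
    decision_log="s3://audit-bucket/agent-07/2025-06-01.jsonl",  # placeholder location
    control_tests={"kill_switch": "pass", "tool_allowlist": "pass", "hitl_gate": "pass"},
)
Path("assurance_record.json").write_text(json.dumps(record, indent=2))
```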
Delivering on the Promise of Agentic AI
In summary, none of the measures above are novel. They borrow from secure software engineering, zero-trust networking and safety-critical system management.
The difference is that we must apply them simultaneously and without exception. Strong guardrails around AI agents paired with a neglected supply chain may still end in the same catastrophic result.
Defining what good looks like in the world of agentic AI security, and then delivering on it, will be the only acceptable approach to bring this promising technology into the environments which most closely affect our daily lives.
We are not quite there yet, but the path is visible.