
As headlines pile up with warnings about AI-powered hacking and machine-generated malware, you could be forgiven for thinking we’ve entered a new era of AI-conceived cyber threats. But new research from Forescout’s Vedere Labs reveals a much more grounded reality: today’s generative AI models, whether open-source, commercial, or even underground, still fall far short of concocting sophisticated cyberattacks without a skilled human at the helm.
In other words, the idea of an AI creating a zero-day exploit without any human involvement is, for now at least, still close to science fiction. That is because exploiting the business logic that makes a zero-day so powerful still depends on human cognition. AI can autonomously press “go” and act on such exploits once they are in the wild, but it cannot yet create them on its own.
Testing the Limits of AI as a Cyber Offender
Between February and April 2025, Vedere Labs conducted a comprehensive evaluation of more than 50 AI models to assess their potential to carry out cyberattacks. The focus was on two key areas: Vulnerability Research (VR), which involves discovering software weaknesses, and Exploit Development (ED), which turns those weaknesses into usable attack code.
What makes this study particularly grounded is the diversity of models tested. The researchers included open-source large language models (LLMs), AI tools from the cybercriminal underground (such as WormGPT and EvilGPT), and several major commercial players, including OpenAI’s ChatGPT, Google Gemini, Microsoft Copilot and Anthropic’s Claude.
Their goal? To see whether any of these tools could turn an average attacker into a resourceful reverse engineer.
The results? Striking, but perhaps not in the way you might expect.
AI-as-a-Reverse-Engineer? Not Quite Yet
Across both tasks, the models’ performance was underwhelming:
- Nearly half failed basic vulnerability research tasks.
- A staggering 93% failed the somewhat more complex exploit development benchmark.
- None of the models were able to successfully complete all vulnerability research and exploit development tasks.
Even the underground models designed to skirt safety filters and deliver malicious outputs struggled to produce functional attack code, lagging behind their commercial counterparts. Open-source models fared the worst. And while commercial AI systems generally performed better, especially in reasoning tasks, only a handful managed to generate a usable exploit. Most required hours of human prompting, code corrections and debugging.
In short, the dream (or nightmare) of AI turning script kiddies into formidable, full-stack hackers remains unrealised.
Don’t Let the Flashy Interfaces Fool You
Interestingly, the underground models, often marketed as “hacker’s AI” tools, were more show than substance. Some are tuned to adopt the persona of a stereotypical hacker or gangster; others deliver confidently wrong answers. Much of the output was unstable, incomplete or just plain wrong.
This highlights a catch with generative AI: outsourcing a task you lack the expertise to verify leaves you with the dilemma of whether to trust the result, because it looks convincing, authoritative and persuasive. Malicious actors are not immune to that. The attackers most likely to expect a boost from AI, such as less experienced threat actors, may actually waste time or make critical mistakes by trusting unverified outputs.
Reality Check: Why Humans Still Matter
Despite the hype, these findings reinforce a crucial point: AI will not be creating an army of AI-enhanced cybercriminals any time soon. Instead, it is acting as a force multiplier. For skilled attackers, AI can assist with mundane tasks like scripting, summarising documentation or refining code snippets. But the creative, strategic aspects of cyberattacks, such as choosing targets, chaining exploits and adapting in real time, still require human intelligence.
That said, there’s also clear evidence that models are improving rapidly. In just three months, Vedere Labs observed noticeable gains in reasoning and reliability, especially from fine-tuned or task-specific models. So while autonomous offensive AI may not be a threat today, the trajectory suggests it could become one sooner than we think.
What Does This Mean for Cybersecurity Leaders?
The good news is that existing cybersecurity best practices still work. AI hasn’t been transformational in how attacks are performed; it has just shortened the loop, helping with automation. Organisations that double down on cyber hygiene, vulnerability management, network segmentation and least-privilege access are still well-positioned to protect against both traditional and AI-enhanced threats.
Moreover, defenders have just as much, if not more, to gain from AI.
For example, AI is being used not to launch attacks, but to defend systems more intelligently by:
- Generating human-readable threat reports from raw telemetry (a minimal sketch follows this list).
- Integrating threat intelligence into platforms like Microsoft Copilot.
- Using AI-powered honeypots to simulate ransomware and study attacker behaviour.
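To make the first of these concrete, here is a minimal, hypothetical sketch of telemetry summarisation. It assumes the openai Python SDK and an OpenAI-compatible chat endpoint; the summarise_telemetry helper, model name, prompt wording and sample events are illustrative only and are not taken from the Vedere Labs study.

```python
# Sketch: turning raw telemetry into a human-readable threat summary with an LLM.
# Assumes the openai Python SDK (v1+) and OPENAI_API_KEY set in the environment.
import json
from openai import OpenAI

client = OpenAI()

def summarise_telemetry(events: list[dict]) -> str:
    """Ask an LLM to summarise raw security events for a human analyst."""
    prompt = (
        "You are a security analyst. Summarise the following telemetry as a "
        "short, plain-English threat report: affected hosts, suspected "
        "technique, and recommended next steps.\n\n"
        + json.dumps(events, indent=2)
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any capable chat model would do
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,      # keep the report factual and terse
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    sample_events = [
        {"host": "srv-01", "event": "failed_login", "count": 142, "source_ip": "203.0.113.7"},
        {"host": "srv-01", "event": "new_admin_account", "user": "svc_backup2"},
    ]
    print(summarise_telemetry(sample_events))
```

The low temperature and narrow prompt reflect the defensive use case: the model is asked to condense data an analyst will still verify, not to make autonomous decisions.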
Rather than panic about AI on offence, the real opportunity lies in scaling up its use on the defensive side.
A Call for Balance, Caution, and Certainly Not Alarmism
There’s no question that AI will reshape cybersecurity. But the narrative around AI-empowered attackers often leaps ahead of the facts. This research has brought much-needed clarity to the conversation: no, AI doesn’t turn a Clark Kent into a Superman.
We should absolutely be preparing for more advanced use of AI by cybercriminals in the future. But we should do so with strategies and policies that are grounded in real data, not hype or sensationalism.
Cybersecurity professionals, policymakers and tech leaders must stay informed, remain agile, and, perhaps most importantly, invest in the human side of AI. Building skilled teams that understand both the potential and limitations of this transformative technology should absolutely be a cornerstone of AI strategies going forward.
AI has yet to become the cyber threat some hype cycles would have us believe; however, its evolution is undeniable. The best defence today remains a mix of technical controls, robust strategy, and human expertise, supported by AI rather than overwhelmed by it. If we keep our eyes open and our approach balanced, we can almost certainly stay ahead of whatever comes next.
