Learnings from the Anthropic AI cyber exploitation

By Professor Keeley Crockett, IEEE Senior Member and Professor in Computational Intelligence, Manchester Metropolitan University

US artificial intelligence (AI) company Anthropic recently admitted that its technology had been weaponised by hackers to carry out a number of sophisticated cyber attacks. There is growing concern about criminals using AI as the technology becomes more accessible. Anthropic said threat actors had used its AI to what it termed an ‘unprecedented degree’ after it detected a case of ‘vibe hacking’, in which its AI was used to write code capable of hacking into 17 different organisations. This involved using Anthropic’s chatbot Claude to make decisions on what data to exfiltrate and how to draft psychologically targeted extortion demands.

What is vibe hacking? 

Before understanding vibe hacking, it’s important to know what vibe coding is. Using natural language prompts and iterative refinement, individuals can now create new apps and software even with limited programming skills. This practice of generating complex code from plain-language instructions given to large language models (LLMs) has become known as vibe coding.

However, it can also aid hackers, who can use LLMs to identify vulnerabilities and optimise exploits by automating tasks like code completion, bug detection, or even generating malicious payloads tailored to specific systems. They can describe malicious behaviour in plain language and the LLM can produce working scripts in return. While this activity is monitored on legitimate platforms like ChatGPT, and overtly malicious prompts are blocked or sanitised, there are a number of ways to overcome this, such as running a local LLM. This malicious use of LLM-assisted coding is what has become known as vibe hacking.

The main security implication of vibe coding is that, without discipline, documentation, and review, such code can fail under attack. This increases the risk of sensitive data leaks and can create opportunities for threat actors, as seen in the example of Anthropic’s Claude.

The state of play and future threats 

The news that Claude has been misused in this way should serve as a reminder of how quickly AI is advancing. It highlights how easily these tools can drift from their developers’ intended purposes. In this case, AI was used not only to write code but also to help shape decisions about which data to exploit, how to craft extortion demands, and even what ransom amounts to suggest. Ultimately, the time needed to exploit vulnerabilities is shrinking and defenders can’t solely rely on being reactive.

Unfortunately, this kind of exploitation serves as a clear warning of what could come next. These attacks didn’t directly involve fully agentic AI, or systems that can act with a degree of autonomy and pursue goals without continuous human direction. However, they illustrate that today’s powerful AI tools can accelerate cybercrime. Agentic AI has been described as the next big step in the field, promising greater efficiency, but it also carries significant risks if attackers weaponise it to plan, adapt and act in real time.

One major concern with agentic AI systems is their operation within the cloud. Data in transit, if improperly encrypted, could be intercepted, and systems could be hijacked, allowing attackers to impersonate an individual. Furthermore, multi-tenancy vulnerabilities in cloud infrastructure can allow data leakage between different agentic AI systems. If an agentic AI system makes use of third-party products or services, then third-party APIs can increase the number of potential security breaches, especially if there hasn’t been due diligence on the third-party provider. Another concern is that agentic systems could autonomously initiate data transfers without explicit human approval, allowing the unwitting transmission of personal and sensitive data.
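
To make that last point concrete, the short Python sketch below shows one way an explicit human-approval gate might sit in front of an agentic system’s outbound data transfers. It is a minimal illustration only; the class, field, and function names are hypothetical rather than taken from any real agent framework.

    from dataclasses import dataclass

    @dataclass
    class OutboundTransfer:
        destination: str              # e.g. a third-party API endpoint
        payload: bytes
        contains_personal_data: bool

    def request_human_approval(transfer: OutboundTransfer) -> bool:
        # In a real deployment this might raise a ticket or page an operator;
        # here it simply prompts on the console.
        answer = input(f"Approve transfer to {transfer.destination}? (y/n) ")
        return answer.strip().lower() == "y"

    def send(transfer: OutboundTransfer) -> None:
        # Refuse to transmit personal or sensitive data unless a human has
        # explicitly approved it, then hand off to an encrypted transport.
        if transfer.contains_personal_data and not request_human_approval(transfer):
            raise PermissionError("Transfer blocked: no explicit human approval")
        # ... send transfer.payload over TLS to transfer.destination here ...

The same gate can sit in front of third-party API calls, keeping a human between the agent’s decision and any action that moves data out of the organisation.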

How to best prepare for AI attacks 

Looking ahead, the use and development of AI is only going to increase, with a recent IEEE survey showing 96 percent of experts expect agentic AI to continue developing at a rapid pace in 2026. The emphasis, therefore, needs to be on defence and protection. Safeguards, oversight and resilience must be built into intelligent systems. Even without full autonomy, AI is rapidly lowering the barriers for less skilled threat actors and adding psychological sophistication to extortion. Businesses cannot afford to wait until agentic AI becomes mainstream, as this would leave many organisations on the back foot.

In recent years it has become routine for enterprise security teams to send simulated phishing emails to their employees, which, when activated, lead to a site informing the employee of their mistake and educating them on the dangers of real-world phishing emails. Employees now also need to be trained to recognise fake audio and video created with generative AI. It is worth noting that there are no known tools that can accurately identify attacks derived from generative AI, as the modus operandi is to appear human-like.

Companies should also adhere to secure code development practices, including refactoring code for production use and ensuring security hygiene: input validation, the principle of least privilege, threat modelling, secure storage, and other well-established secure coding practices.
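
By way of illustration, the brief Python sketch below shows two of those practices, allow-list input validation and parameterised database queries, applied to a simple lookup; the table and column names are hypothetical and the example is a sketch rather than a production design.

    import re
    import sqlite3

    # Allow-list pattern: usernames may only contain letters, digits and underscores.
    USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,32}")

    def get_user(conn: sqlite3.Connection, username: str):
        # Validate input against the allow-list before it reaches the database.
        if not USERNAME_RE.fullmatch(username):
            raise ValueError("Invalid username")
        # Parameterised query: user input is never concatenated into the SQL string.
        cur = conn.execute("SELECT id, email FROM users WHERE username = ?", (username,))
        return cur.fetchone()

Rejecting anything that does not match an explicit allow-list, rather than trying to block known-bad inputs, is the safer default whether the code was written by a person or generated by an LLM.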

Beyond firewalls and intrusion detection systems, the first line of defence against these attacks is simply to educate employees about the dangers of AI and vibe hacking. However, only a fraction will take this advice on board. People generally learn only after making a mistake, and in these circumstances one mistake may be one too many. As AI usage continues to rise and the associated threats increase in the new year, organisations must take a more serious approach to addressing vibe hacking.
