New original research from the AI security and trust company demonstrates how hidden instructions embedded in images can silently manipulate visual language models, revealing a critical gap that existing security architecture has no way to detect

TEL AVIV, Israel, July 1, 2026 /PRNewswire/ -- DeepKeep, the end-to-end AI security platform, today unveiled a new class of visual prompt injection vulnerability – dubbed ‘InkJect,’ a nod to the hidden ‘ink’ within images used to inject malicious instructions – affecting leading visual language models (VLMs), including OpenAI’s GPT-5.2 and GPT-5.4 Mini and Anthropic’s Claude Sonnet 4.6 and Opus 4.5. The attack allows malicious actors to embed hidden instructions inside images that VLMs process during regular operation, causing the models to execute unauthorized actions without any indication to the user.

The discovery comes at a critical moment: 40% of all generative AI solutions are predicted to be multimodal by 2027, and enterprises are increasingly embedding VLMs into core workflows for code generation, data analysis, and task automation. While major AI leaders have deployed guardrails that detect and block conventional text-based prompt injection attempts, DeepKeep’s research demonstrates that these protections do not extend to the visual processing layer, creating an exploitable blind spot. Despite its significant inherent risks, this attack vector has received minimal academic attention, with little more than a single academic paper dedicated to it to date. As such, DeepKeep’s team developed this research independently.

The InkJect vulnerability discovered by DeepKeep relies on indirect prompt injection, in which an attacker embeds malicious instructions within an image hosted in a public repository, rather than uploading the compromised image directly to a model. When the user instructs a VLM to implement a feature by referencing that repository, the model retrieves and processes the image as part of its standard workflow, unknowingly introducing a weakness, such as a backdoor, that the attacker can later exploit.
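One way to narrow the indirect path described above is to separate image files from the rest of a repository before a coding assistant reads it, so that images are reviewed rather than passed to the model automatically. The following is a minimal sketch of that idea and is not DeepKeep's method; the function name `split_repo_files`, the extension list, and the example paths are all hypothetical.

```python
from pathlib import PurePosixPath

# Assumption: the file extensions treated as images. Extend as needed.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".webp", ".svg"}


def split_repo_files(paths):
    """Partition repository paths into (text_files, image_files).

    Image files are returned separately so a caller can hold them back
    for review instead of handing them to a model by default.
    """
    text_files, image_files = [], []
    for path in paths:
        # Compare extensions case-insensitively (".PNG" == ".png").
        suffix = PurePosixPath(path).suffix.lower()
        if suffix in IMAGE_EXTENSIONS:
            image_files.append(path)
        else:
            text_files.append(path)
    return text_files, image_files


if __name__ == "__main__":
    # Hypothetical repository listing, for illustration only.
    files = ["README.md", "docs/architecture.PNG", "src/app.py", "assets/logo.svg"]
    text_files, image_files = split_repo_files(files)
    print("pass to model:", text_files)
    print("hold for review:", image_files)
```

A filter like this only decides which files reach the model. It does not inspect image contents, and it does nothing about instructions hidden in ordinary text files.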

The instructions themselves are designed to evade detection. Visual manipulation and near-invisible formatting techniques, such as white text on white backgrounds, allow the malicious commands to bypass security scanning while remaining fully legible to the VLM. DeepKeep also found that skewing or distorting the perspective of embedded text was sufficient to defeat optical character recognition (OCR)-based scanning controls, while the VLM retained the ability to interpret the content accurately – a technique that further widens the gap between what security tools can detect and what models can read and, thus, implement.
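To illustrate why near-invisible formatting such as white text on a white background is still machine-readable, the sketch below flags pixels that differ from the background by a small, nonzero amount. It is an illustrative assumption rather than a technique from DeepKeep's research: the image is modeled as a plain grid of grayscale values (0 to 255), and the names `find_low_contrast_pixels` and `faint_ratio` and the `tolerance` default are hypothetical.

```python
def find_low_contrast_pixels(gray, background=255, tolerance=12):
    """Return (row, col) positions that are faint against the background.

    A pixel is flagged when it differs from the background value by more
    than 0 and at most `tolerance`, i.e. marks a person would likely miss.
    """
    hits = []
    for r, row in enumerate(gray):
        for c, value in enumerate(row):
            diff = abs(value - background)
            if 0 < diff <= tolerance:
                hits.append((r, c))
    return hits


def faint_ratio(gray, background=255, tolerance=12):
    """Fraction of all pixels flagged as faint; 0.0 for an empty image."""
    total = sum(len(row) for row in gray)
    if total == 0:
        return 0.0
    return len(find_low_contrast_pixels(gray, background, tolerance)) / total


if __name__ == "__main__":
    # Hypothetical 4x6 grayscale image: white (255) with a few
    # almost-white (250) pixels standing in for hidden text strokes.
    image = [
        [255, 255, 255, 255, 255, 255],
        [255, 250, 250, 255, 250, 255],
        [255, 250, 255, 255, 250, 255],
        [255, 255, 255, 255, 255, 255],
    ]
    print("faint pixels:", find_low_contrast_pixels(image))
    print("faint ratio:", faint_ratio(image))
```

Real images contain compression noise and anti-aliasing, so a fixed threshold like this would need tuning and would produce false positives. It also does not address the skewed or distorted text described above, which targets OCR rather than contrast.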

In one test, a developer asked a VLM to add a basic information page to a website. The hidden instructions caused the model to silently insert a member login system with administrator credentials, giving an attacker full back-end access without any indication to the developer that anything beyond the requested task had been completed.

“AI’s visual processing layer has been largely overlooked and less understood, and that is precisely what makes it valuable to malicious attackers,” said Yossi Altevet, CTO and Co-Founder at DeepKeep. “We were able to manipulate models that would explicitly flag and refuse a text-based attack, simply by placing the instruction within an image. For any business relying on AI models, this should be a serious wake-up call and a signal that protecting AI systems requires purpose-built security that operates at every layer of how these models process and act on information.”

DeepKeep found that InkJect attack success rates varied across models, but OpenAI’s GPT-5.2 and GPT-5.4 Mini and Anthropic’s Claude Sonnet 4.6 and Opus 4.5 were all susceptible to the technique.

The vulnerability was disclosed to both OpenAI and Anthropic.

The new research comes as DeepKeep continues to expand its suite of AI security solutions for enterprise use. To learn more about InkJect and the company’s findings, visit here.

About DeepKeep

DeepKeep provides end-to-end AI security and trustworthiness across the full AI lifecycle. Its platform protects multimodal systems – including large language models and computer vision – helping enterprises deploy and use AI safely, accurately, and in compliance with security and privacy standards. With capabilities such as an AI Firewall, Automated AI Red Teaming, AI Usage Control and advanced Model Scanning, DeepKeep enables cybersecurity teams to defend against vulnerabilities, data leakage, hallucinations, and bias while maintaining trust in AI-driven operations. Founded in 2021 by Rony Ohayon and a team of cybersecurity experts, DeepKeep is dedicated to securing the future of enterprise AI. For more information, visit deepkeep.ai.

Media Contact
Mike Katznelson
Headline Media
[email protected]
US: +1 914 233 5302
UK: +44 203 769 0660

View original content: https://www.prnewswire.com/news-releases/deepkeep-exposes-inkject-a-new-visual-prompt-injection-vulnerability-that-bypasses-guardrails-in-leading-ai-models-302815702.html

SOURCE DeepKeep

DeepKeep Exposes ‘InkJect,’ a New Visual Prompt Injection Vulnerability that Bypasses Guardrails in Leading AI Models
