Home / News / Honey, I shrunk the image and now I’m pwned

Honey, I shrunk the image and now I’m pwned

Security researchers with Trail of Bits have found that Google Gemini CLI and other production AI systems can be deceived by image scaling attacks, a well-known adversarial challenge for machine learning systems.

Google doesn’t consider the issue to be a security vulnerability because it relies on a non-default configuration.

Image scaling attacks were discussed in a 2019 USENIX Security paper that builds upon prior work on adversarial examples that could confuse computer vision systems. The technique involves embedding prompts into an image that tell the AI to act against its guidelines, then manipulating the image to hide the prompt from human eyes. It requires the image to be prepared in a way that the malicious prompt encoding interacts with whichever image scaling algorithm is employed by the model.

In a blog post, Trail of Bits security researchers Kikimora Morozova and Suha Sabi Hussain explain the attack scenario: a victim uploads a maliciously prepared image to a vulnerable AI service and the underlying AI model acts upon the hidden instructions in the image to steal data.

“By delivering a multi-modal prompt injection not visible to the user, we achieved data exfiltration on systems including the Google Gemini CLI,” wrote Morozova and Hussain. “This attack works because AI systems often scale down large images before sending them to the model: when scaled, these images can reveal prompt injections that are not visible at full resolution.”

We’ve lightened the example image published by Trail of Bits to make the hidden text more visible.

![An image with an embedded prompt](https://regmedia.co.uk/2025/08/21/image_scaling_attack_demo.jpg?x=648&y=310&infer_y=1 “An image with an embedded prompt”)
An image with an embedded prompt – Click to enlarge

Prompt injection occurs when a generative AI model is provided with input that combines trusted and untrusted content. It’s not the same thing as jailbreaking, which is simply input that aims to bypass safety mechanisms.

Prompt injection may be direct, entered by the user, or indirect, when the user directs the model to process content that contains instructions the model can act upon. 

An example of the latter would be asking an AI model to summarize a web page that contains malicious instructions – the model, having no inherent ability to distinguish between intended and unintended directives, would simply try to follow all orders. This behavior was noted recently in Perplexity’s Comet browser.

The image scaling attack described by Morozova and Hussain is a form of indirect prompt injection, and it has a better chance of success than many other techniques because the malicious text is hidden from the user – it’s exposed only through the process of downscaling the image.

To show the true malicious potential of the technique, Morozova and Hussain developed an open source tool called Anamorpher that can be used to craft images targeting each of the three common downscaling algorithms: nearest neighbor interpolation, bilinear interpolation, and bicubic interpolation.

The researchers say they have devised successful image scaling attacks against Vertex AI with a Gemini back end, Gemini’s web interface, Gemini’s API via the llm CLI, Google Assistant on an Android phone, and the Genspark agentic browser.

Google noted that the attack only works with a non-standard configuration of Gemini.

“We take all security reports seriously and appreciate the research from the security community,” a Google spokesperson told The Register. “Our investigation found that the behavior described is not a vulnerability in the default, secure configuration of Gemini CLI.”

In order for the attack to be possible, Google’s spokesperson explained, a user would first need to explicitly declare they trust input, overriding the default setting, and then ingest the malicious file.

“As documented in our project repo, we strongly recommend that developers only provide access to files and data they trust, and work inside a sandbox,” Google’s spokesperson said. 

“While we empower users with advanced configuration options and security features, we are taking this opportunity to add a more explicit warning within the tool for any user who chooses to disable this safeguard.”

The Trail of Bits researchers advise not using image downscaling in agentic AI systems. And if that’s necessary, they argue that the user should always be presented with a preview of what the model actually sees, even for CLI and API tools.

But really, they say AI systems need systematic defenses that mitigate the risk of prompt injection. ®

**Get our** Tech Resources
Image scaling attacks were discussed in a 2019 USENIX Security paper that builds upon prior work on adversarial examples that could confuse computer vision systems.
The technique involves embedding prompts into an image that tell the AI to act against its guidelines, then manipulating the image to hide the prompt from human eyes.
It requires the image to be prepared in a way that the malicious prompt encoding interacts with whichever image scaling algorithm is employed by the model.
The Trail of Bits researchers advise not using image downscaling in agentic AI systems.
But really, they say AI systems need systematic defenses that mitigate the risk of prompt injection.

Security researchers with Trail of Bits have found that Google Gemini CLI and other production AI systems can be deceived by image scaling attacks, a well-known adversarial challenge for machine learning systems.

Google doesn’t consider the issue to be a security vulnerability because it relies on a non-default configuration.

Image scaling attacks were discussed in a 2019 USENIX Security paper that builds upon prior work on adversarial examples that could confuse computer vision systems. The technique involves embedding prompts into an image that tell the AI to act against its guidelines, then manipulating the image to hide the prompt from human eyes. It requires the image to be prepared in a way that the malicious prompt encoding interacts with whichever image scaling algorithm is employed by the model.

In a blog post, Trail of Bits security researchers Kikimora Morozova and Suha Sabi Hussain explain the attack scenario: a victim uploads a maliciously prepared image to a vulnerable AI service and the underlying AI model acts upon the hidden instructions in the image to steal data.

“By delivering a multi-modal prompt injection not visible to the user, we achieved data exfiltration on systems including the Google Gemini CLI,” wrote Morozova and Hussain. “This attack works because AI systems often scale down large images before sending them to the model: when scaled, these images can reveal prompt injections that are not visible at full resolution.”

We’ve lightened the example image published by Trail of Bits to make the hidden text more visible.

An image with an embedded prompt

An image with an embedded prompt – Click to enlarge

Prompt injection occurs when a generative AI model is provided with input that combines trusted and untrusted content. It’s not the same thing as jailbreaking, which is simply input that aims to bypass safety mechanisms.

Prompt injection may be direct, entered by the user, or indirect, when the user directs the model to process content that contains instructions the model can act upon. 

An example of the latter would be asking an AI model to summarize a web page that contains malicious instructions – the model, having no inherent ability to distinguish between intended and unintended directives, would simply try to follow all orders. This behavior was noted recently in Perplexity’s Comet browser.

The image scaling attack described by Morozova and Hussain is a form of indirect prompt injection, and it has a better chance of success than many other techniques because the malicious text is hidden from the user – it’s exposed only through the process of downscaling the image.

To show the true malicious potential of the technique, Morozova and Hussain developed an open source tool called Anamorpher that can be used to craft images targeting each of the three common downscaling algorithms: nearest neighbor interpolation, bilinear interpolation, and bicubic interpolation.

The researchers say they have devised successful image scaling attacks against Vertex AI with a Gemini back end, Gemini’s web interface, Gemini’s API via the llm CLI, Google Assistant on an Android phone, and the Genspark agentic browser.

Google noted that the attack only works with a non-standard configuration of Gemini.

“We take all security reports seriously and appreciate the research from the security community,” a Google spokesperson told The Register. “Our investigation found that the behavior described is not a vulnerability in the default, secure configuration of Gemini CLI.”

In order for the attack to be possible, Google’s spokesperson explained, a user would first need to explicitly declare they trust input, overriding the default setting, and then ingest the malicious file.

“As documented in our project repo, we strongly recommend that developers only provide access to files and data they trust, and work inside a sandbox,” Google’s spokesperson said. 

“While we empower users with advanced configuration options and security features, we are taking this opportunity to add a more explicit warning within the tool for any user who chooses to disable this safeguard.”

The Trail of Bits researchers advise not using image downscaling in agentic AI systems. And if that’s necessary, they argue that the user should always be presented with a preview of what the model actually sees, even for CLI and API tools.

But really, they say AI systems need systematic defenses that mitigate the risk of prompt injection. ®

Get our Tech Resources

Tagged:

Leave a Reply

Your email address will not be published. Required fields are marked *