Your robot could obey a sign, not you, thanks to AI robot prompt injection

Date:

Share:



AI robot prompt injection is no longer just a screen-level problem. Researchers demonstrate that a robot can be steered off-task by text placed in the physical world, the kind of message a human might walk past without a second thought.

The attack doesn’t rely on breaking into the robot’s software or spoofing sensors. It instead treats the environment like an input box, placing a misleading sign, poster, or label where a camera will read it.

In simulation tests, the researchers report attack success rates of 81.8% in an autonomous driving setup and 68.1% in a drone emergency landing task. In physical trials with a small robotic car, printed prompts overrode navigation with success of at least 87% across different lighting and viewing conditions.

When a sign becomes a command

The method, called CHAI, targets the command layer, the intermediate instruction a vision language model produces before a controller turns it into movement. If that planning step gets pushed toward the wrong instruction, the rest of the autonomy stack can execute it faithfully. No malware required.

The threat model is deliberately low-tech. The attacker is treated as a black box outsider who can’t touch onboard systems, it only needs the ability to place text within the camera’s field of view.

It’s designed to travel

CHAI doesn’t only optimize what the prompt says. It also tunes how the text appears, including choices like color, size, and placement, because readability to the model is part of what drives the outcome.

The paper also reports that the approach generalizes beyond a single scene. It describes “universal” prompts that keep working on unseen images, with results averaging at least 50% success across tasks and models, and exceeding 70% in one GPT-based setup. It even works across languages, including Chinese, Spanish, and mixed-language prompts, which can make a planted message harder for nearby humans to notice.

The safety checklist is changing

On defense, the researchers point to three directions. One is filtering and detection, looking for suspicious text in images or in the model’s intermediate output. Another is alignment work, making models less willing to treat environmental writing as executable instruction. The third is longer-term robustness research aimed at stronger guarantees.

A practical next step is to treat perceived text as untrusted input by default, then require it to pass mission and safety checks before it can influence motion planning. If your robot reads signs, test what happens when the signs lie. The work is slated for SaTML 2026, which should put these defenses under a brighter spotlight.



Source link

━ more like this

1Password helps prevent your passwords from going to scam sites

Phishing scams are evolving fast, and AI-assisted sites are making fake login pages look more convincing than ever. To help users stay safe,...

You might actually be able to buy a Tesla robot in 2027

Tesla CEO Elon Musk has once again laid out an ambitious timeline for the company’s long-awaited humanoid robot, Optimus. Speaking at the World...

Your next road trip is booked: Forza Horizon 6 comes this May

After months of anticipation and speculation, and even a leaked release date, Playground Games and Xbox have finally given fans what they’ve been...

Here’s when you can buy AMD’s Ryzen 7 9850X3D and how much it’ll cost

AMD has finally confirmed pricing and availability for its Ryzen 7 9850X3D processor, the company’s newest near-flagship desktop CPU aimed at gaming enthusiasts....

Sennheiser introduces new TV headphones bundle with Auracast

Sennheiser has unveiled its RS 275 TV Headphones, which are bundled with a BTA1 digital receiver. These headphones use Auracast technology to provide...
spot_img