Your AI browser can be hijacked by prompt injection, OpenAI just patched Atlas

Date:

Share:



OpenAI has shipped a security update to ChatGPT Atlas aimed at prompt injection in AI browsers, attacks that hide malicious instructions inside everyday content an agent might read while it works.

Atlas’s agent mode is built to act in your browser the way you would: it can view pages, click, and type to complete tasks in the same space and context you use. That also makes it a higher-value target, because the agent can encounter untrusted text across email, shared documents, forums, social posts, and any webpage it opens.

The company’s core warning is simple. Hackers can trick the agent’s decision-making by smuggling instructions into the stream of information it processes mid-task.

A hidden instruction, big consequences

OpenAI’s post highlights how quickly things can go sideways. An attacker seeds an inbox with a malicious email that contains instructions written for the agent, not the human.

Later, when the user asks Atlas to draft an out-of-office reply, the agent runs into that email during normal work and treats the injected instructions as authoritative. In the demo scenario, the agent sends a resignation letter to the user’s CEO, and the out-of-office never gets written.

If an agent is scanning third-party content as part of a legitimate workflow, an attacker can try to override the user’s request by hiding commands in what looks like ordinary text.

An AI attacker gets practice runs

To find these failures earlier, OpenAI says it built an automated attacker model and trained it end-to-end with reinforcement learning to hunt for prompt-injection exploits against a browser agent. The goal is to pressure-test long, realistic workflows, not just force a single bad output.

The attacker can draft a candidate injection, run a simulated rollout of how the target agent would behave, then iterate using the returned reasoning and action trace as feedback. OpenAI says privileged access to those traces gives its internal red team an advantage external attackers don’t have.

What to do with this now

OpenAI frames prompt injection as a long-term security problem, more like online scams than a bug you patch once. Its approach is to discover new attack patterns, train against them, and tighten system-level safeguards.

For users, you should use logged-out browsing when you can, scrutinize confirmations for actions like sending email, and give agents narrow, explicit instructions instead of broad “handle everything” prompts. If you’re still curious what AI browsing can do, then go with browsers that ship updates that benefit you.



Source link

━ more like this

Tech Reader review recap: Lots of Apple devices, Galaxy S26, Dell XPS 16 and more

Apple already announced a lot of new devices in 2026 and we’ve been busy reviewing them all. In this installment of our bi-weekly...

Google will still let you sideload apps, but there’s a catch now

With the upcoming Android developer verification rules, there’s been a growing concern regarding Google effectively killing sideloading Android apps. But Google says that’s...

A retro Starship Troopers shooter, a video store sim and other new indie games worth checking out

Welcome to our latest roundup of what's going on in the indie game space. There are a whole bunch of neat new games...

We keep finding the raw material of DNA in asteroids—what’s it telling us?

On Monday, a paper announcing that all four DNA bases had been found on an asteroid sparked a lot of headlines. But many...

I Tried DoorDash’s Tasks App and Saw the Bleak Future of AI Gig Work

The flash from my iPhone camera illuminates my dirty socks and underwear as I hold each item up for the video recording to...
spot_img