OpenAI’s ChatGPT Agent Is Haunting My Browser


Most people’s browser tabs are filled with unread news articles. Mine are filled with AI agents and ghost clicks.

I have four instances of OpenAI’s ChatGPT Agent (the generative AI tool released last week, which can run searches and perform tasks on the web) already open, each running in its own tab. I’ve given these first four agents relatively simple jobs based on ChatGPT’s suggestions. One is clicking around to find a birthday gift on the Target website, and another is generating a pitch deck about robotic dogs. I open a fifth tab to try something more experimental: I want to see how good this ChatGPT Agent is at chess.

After typing in some instructions, I watch as a ghostly cursor floats across my screen and the ChatGPT Agent goes to Chess.com and plays an online opponent, all in a virtual browser. Things go south pretty quickly. The game’s strategy isn’t what trips up the AI tool; the act of moving the chess pieces proves to be the most difficult part. “I’m focusing on accurate positioning as I continue playing despite earlier misclicks,” the agent says in its internal log before eventually quitting and letting me know that the controls were too difficult to navigate.

Over the past few years, browser developers have integrated AI tools with middling success. In recent weeks, though, the idea of a web browser enhanced by a baked-in generative AI chatbot has resurged with the release of OpenAI’s ChatGPT Agent and Perplexity’s Comet.

The two releases are quite different in their execution. Comet is a stand-alone browser, so you can use it to surf the web and then summon the AI assistant to help write an email or complete a menial chore. OpenAI built its browsing tool inside a chatbot; you talk to the chatbot through a web interface to give it tasks, and then the bot runs its own virtual browser inside your browser to complete them.

Both releases can take control of cursors, enter text, and click on links. If this trend takes off, these kinds of AI-powered browsers could transform the internet into a ghost town where agents run amok and humans rarely venture.

Tangled Web

Despite the continued AI hype, my initial impression of OpenAI’s ChatGPT Agent is that the glitchy feature currently seems more like a proof of concept than a fully baked release. When executing the various tasks I gave it, the ChatGPT Agent often misclicked or fumbled through other errors. Its guardrails also appeared inconsistent. The agent denied some explicit requests, like asking it to fetch pornographic videos or “find a dildo,” yet it spent 18 minutes shopping for the perfect “c-ring” on an X-rated website for adult toys: “I’ve gathered details on 10 metal cock rings, including various prices and features.”

I also couldn’t help but wonder how this approach to browsing the internet might further hollow out the market for digital display ads, a business that’s already struggling. My agents passed over ads for everything from rental cars to real estate investments. If you’re not actively watching the agent click around in real time, you can watch replays afterward and see everything that appeared in the browser while the AI tool was in control, ads included. It makes sense that users would speed-scrub through a replay now, while the nascent feature is filled with errors. But if the accuracy rate for AI agents improves over time, then fewer people will feel the need to watch over their agent’s shoulder, and fewer humans will be seeing those ads. At that point, it’s hard to imagine advertisers sticking around.


