Google’s new robotics AI can run without the cloud and still tie your shoes

Date:

Share:



We sometimes call chatbots like Gemini and ChatGPT “robots,” but generative AI is also playing a growing role in real, physical robots. After announcing Gemini Robotics earlier this year, Google DeepMind has now revealed a new on-device VLA (vision language action) model to control robots. Unlike the previous release, there’s no cloud component, allowing robots to operate with full autonomy.

Carolina Parada, head of robotics at Google DeepMind, says this approach to AI robotics could make robots more reliable in challenging situations. This is also the first version of Google’s robotics model that developers can tune for their specific uses.

Robotics is a unique problem for AI because, not only does the robot exist in the physical world, but it also changes its environment. Whether you’re having it move blocks around or tie your shoes, it’s hard to predict every eventuality a robot might encounter. The traditional approach of training a robot on action with reinforcement was very slow, but generative AI allows for much greater generalization.

“It’s drawing from Gemini’s multimodal world understanding in order to do a completely new task,” explains Carolina Parada. “What that enables is in that same way Gemini can produce text, write poetry, just summarize an article, you can also write code, and you can also generate images. It also can generate robot actions.”

General robots, no cloud needed

In the previous Gemini Robotics release (which is still the “best” version of Google’s robotics tech), the platforms ran a hybrid system with a small model on the robot and a larger one running in the cloud. You’ve probably watched chatbots “think” for measurable seconds as they generate an output, but robots need to react quickly. If you tell the robot to pick up and move an object, you don’t want it to pause while each step is generated. The local model allows quick adaptation, while the server-based model can help with complex reasoning tasks. Google DeepMind is now unleashing the local model as a standalone VLA, and it’s surprisingly robust.



Source link

━ more like this

Philips Hue says US prices will go up in July because of tariffs

When Philips Hue’s US prices go up next month, you can officially know who to blame. (Hint: It rhymes with "rump.") Parent company...

Judge rules Anthropic’s AI training on copyrighted materials is fair use

Anthropic has a mixed result in a class action lawsuit brought by a group of authors who claimed the company used their...

Roland just released a weird little riff on an acoustic handpan

Synthesizer giant Roland just announced a fairly bizarre contraption. The Mood Pan is the company's digital take on an acoustic handpan. For the...
spot_img