Google’s new AI tool Whisk uses images as prompts


Google has yet another AI tool to add to the pile. Whisk is a Google Labs image generator that lets you use an existing image as your prompt. But its output only captures your starter image’s “essence” rather than recreating it with new details. So, it’s better for brainstorming and rapid-fire visualizations than edits of the source image.

The company describes Whisk as “a new type of creative tool.” The opening screen is bare-bones, with inputs for only style and subject. This introductory interface lets you choose from just three predefined styles: sticker, enamel pin and plushie. I suspect Google found that those three produce the kind of rough-outline output the experimental tool is best suited to in its current form.

As you can see in the image above, it produced a solid image of a Wilford Brimley plushie. (Google’s terms forbid pictures of celebrities, but Wilford slipped through the gates, Quaker Oats in tow, without alerting the guards.)

Whisk also includes a more advanced editor (found by clicking “Start from scratch” from the main screen). In this mode, you can use text or a source image in three categories: subject, scene and style. There’s also an input bar to add more text for finishing touches. However, in its current form, the advanced controls didn’t produce results that looked anything like my queries.

For example, check out my attempt to generate the late Mr. Brimley in a lightbox scene in the style of a walrus plushie image I found online:

Screenshot of an AI generation tool producing images of a man who looks a bit like Wilford Brimley.

Google / Screenshot by Will Shanklin for Tech Reader

Whisk spit out what looks like a vaguely Wilford Brimley-esque actor eating oatmeal inside a lightbox frame. As far as I can tell, that dude is not a plushie. So, it’s clear why Google recommends using the tool more for “rapid visual exploration” and less for production-ready content.

Google acknowledges that Whisk will only draw from “a few key characteristics” of your source image. “For example, the generated subject might have a different height, weight, hairstyle or skin tone,” the company warns.

To understand why, look no further than Google’s description of how Whisk works under the hood. It uses the Gemini language model to write a detailed caption of the source image you upload. It then feeds that description into the Imagen 3 image generator. So, the result is an image based on Gemini’s words about your image — not the source image itself.
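
To make that flow concrete, here is a minimal Python sketch of a caption-then-generate pipeline. It is an illustration only, not Google's code: the function names, the prompt wording and the two stand-in models are all hypothetical, and the real Gemini and Imagen 3 interfaces are not described in Google's explanation. The point it shows is that the generator receives only the text, never the source image.

```python
from typing import Callable, Tuple

# Hypothetical stand-ins for the two models. These are assumptions made for
# illustration, not real Gemini or Imagen 3 interfaces.
Captioner = Callable[[bytes], str]   # image bytes -> text caption
Generator = Callable[[str], bytes]   # text prompt -> image bytes


def caption_then_generate(
    source_image: bytes,
    style: str,
    captioner: Captioner,
    generator: Generator,
) -> Tuple[str, bytes]:
    """Describe the source image in words, then generate from the words alone."""
    caption = captioner(source_image)
    # The prompt template is made up for this sketch.
    prompt = f"{caption}, in the style of a {style}"
    # The source image is never passed to the generator, only the prompt,
    # so any detail the caption omits cannot appear in the output.
    return prompt, generator(prompt)


def fake_captioner(image: bytes) -> str:
    # A canned caption; a real captioning model would inspect the image.
    return "an older man with a white mustache eating oatmeal"


def fake_generator(prompt: str) -> bytes:
    # Returns placeholder bytes that simply echo the prompt.
    return f"<image rendered from: {prompt}>".encode("utf-8")


if __name__ == "__main__":
    prompt, image = caption_then_generate(
        b"raw pixels", "plushie", fake_captioner, fake_generator
    )
    print(prompt)
```

Because everything the generator knows comes from the caption, details the caption leaves out, such as exact height or hairstyle, have no way to carry over into the result.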

Whisk is only available in the US, at least for now. You can try it at the project’s Google Labs site.


