OpenAI’s Sora 2 lets users insert themselves into AI videos with sound

On Tuesday, OpenAI announced Sora 2, its second-generation video-synthesis AI model that can now generate videos in various styles with synchronized dialogue and sound effects, which is a first for the company. OpenAI also launched a new iOS social app that allows users to insert themselves into AI-generated videos through what OpenAI calls “cameos.”

OpenAI showcased the new model in an AI-generated video that features a photorealistic version of OpenAI CEO Sam Altman talking to the camera in a slightly unnatural-sounding voice amid fantastical backdrops, like a competitive ride-on duck race and a glowing mushroom garden.

Regarding that voice, the new model can create what OpenAI calls “sophisticated background soundscapes, speech, and sound effects with a high degree of realism.” In May, Google’s Veo 3 became the first video-synthesis model from a major AI lab to generate synchronized audio as well as video. Just a few days ago, Alibaba released Wan 2.5, an open-weights video model that can generate audio as well. Now OpenAI has joined the audio party with Sora 2.

OpenAI demonstrates Sora 2’s capabilities in a launch video.

The model also features notable visual consistency improvements over OpenAI’s previous video model, and it can follow more complex instructions across multiple shots while maintaining coherency between them. The new model represents what OpenAI describes as its “GPT-3.5 moment for video,” comparing it to the ChatGPT breakthrough in the evolution of its text-generation models.

Sora 2 appears to demonstrate improved physical accuracy over the original Sora model from February 2024, with OpenAI claiming the model can now simulate complex physical movements like Olympic gymnastics routines and triple axels while maintaining realistic physics. Last year, shortly after the launch of Sora 1 Turbo, we saw several notable failures on similar video-generation tasks that OpenAI now claims to have addressed with the new model.

“Prior video models are overoptimistic—they will morph objects and deform reality to successfully execute upon a text prompt,” OpenAI wrote in its announcement. “For example, if a basketball player misses a shot, the ball may spontaneously teleport to the hoop. In Sora 2, if a basketball player misses a shot, it will rebound off the backboard.”

