Stability AI’s audio generator can now crank out three-minute ‘songs’


Stability AI has released Stable Audio 2.0, an upgraded version of its music-generation platform. The system lets users create up to three minutes of audio via text prompt. That’s around the length of an actual song, so it’ll also whip up an intro, a full chord progression and an outro.

First, the good news. Three minutes is huge. The previous version of the software maxed out at 90 seconds. Just imagine the fake birthday song you could make in the style of that one Rob Thomas/Santana track. Another boon? The tool is free and publicly available through the company’s website, so have at it.

It primarily works via text prompt, but there’s an option to upload an audio clip. The system will analyze the clip and produce something similar. All uploaded audio must be copyright-free, so this isn’t for the purposes of mimicking something that already exists. Rather, it could be useful for, say, humming a drum part or extending a 20-second clip into something longer.

Now, the bad news. This is still AI-generated music. It’s cool as a conversation piece and as an emblem of a possible future that’s great for tinkerers and bad for musicians, but that’s about it. The songs can actually sound nifty, at first, until the seams start showing. Then things get a bit creepy.

For instance, the system loves adding vocals, but not in any known human language. I guess it’s in whatever language makes up the text in AI-generated images. Sometimes the vocals sort of sound like actual people, and other times they sound like Gregorian chanters filtered through outer space. It’s right smack dab in the middle of that uncanny valley. The Verge called the vocals “soulless and weird,” comparing them to whale sounds. That tracks.

Stable Audio 2.0 makes the same weird little mistakes that all of these systems make, no matter the output type. Parts can vanish into thin air, replaced with something else. Sometimes melodic elements will double out of nowhere, like an audio version of those extra fingers in AI-generated images.

There’s also the, well, boring-ness of it all. This is music in name only. Without a human connection, what’s the point? I listen to music to get inside the head of another person or group of people. There’s no head to get inside of here, despite constant proclamations that artificial general intelligence (AGI) is only months away.

So, this tech is an absolute gift for those making silly birthday videos or bank hold music. For everyone else? Shrug. One thing I can say from personal experience: It’s pretty fast. The system concocted an absolutely terrifying big band song about my cat in around a minute.




