Apple, Nvidia, Anthropic Used Thousands of Swiped YouTube Videos to Train AI

In response to the suits, defendants such as Meta, OpenAI, and Bloomberg have argued that their actions constitute fair use. A case against EleutherAI, which originally scraped the books and made them public, was voluntarily dismissed by the plaintiffs.

Litigation in the remaining cases is still in its early stages, leaving the questions surrounding permission and payment unresolved. The Pile has since been removed from its official download site, but it’s still available on file-sharing services.

“Technology companies have run roughshod,” said Amy Keller, a consumer protection attorney and partner at the firm DiCello Levitt who has brought lawsuits on behalf of creatives whose work was allegedly scooped up by AI firms without their consent.

“People are concerned about the fact that they didn’t have a choice in the matter,” Keller said. “I think that’s what’s really problematic.”

Parroting a Parrot

Many creators feel uncertain about the path ahead.

Full-time YouTubers patrol for unauthorized use of their work, regularly filing takedown notices, and some worry it’s only a matter of time before AI can generate content similar to what they make—if not produce outright copycats.

Pakman, the creator of The David Pakman Show, saw the power of AI recently while scrolling on TikTok. He came across a video that was labeled as a Tucker Carlson clip, but when Pakman watched it, he was taken aback. It sounded like Carlson but was, word for word, what Pakman had said on his YouTube show, down to the cadence. He was equally alarmed that only one of the video’s commenters seemed to recognize that it was fake—a voice clone of Carlson reading Pakman’s script.

“This is going to be a problem,” Pakman said in a YouTube video he made about the fake. “You can do this essentially with anybody.”

EleutherAI cofounder Sid Black wrote on GitHub that he created YouTube Subtitles by using a script. That script downloads the subtitles from YouTube’s API in the same way a YouTube viewer’s browser downloads them when watching a video. According to documentation on GitHub, Black used 495 search terms to cull videos, including “funny vloggers,” “Einstein,” “black protestant,” “Protective Social Services,” “infowars,” “quantum chromodynamics,” “Ben Shapiro,” “Uighurs,” “fruitarian,” “cake recipe,” “Nazca lines,” and “flat earth.”

Though YouTube’s terms of service prohibit accessing its videos by “automated means,” more than 2,000 GitHub users have bookmarked or endorsed the code.

“There are many ways in which YouTube could prevent this module from working if that was what they are after,” wrote machine learning engineer Jonas Depoix in a discussion on GitHub, where he published the code Black used to access YouTube subtitles. “This hasn’t happened so far.”

In an email to Proof News, Depoix said he hasn’t used the code since he wrote it as a university student for a project several years ago and was surprised people found it useful. He declined to answer questions about YouTube’s rules.

Google spokesperson Jack Malon said in an email response to a request for comment that the company has taken “action over the years to prevent abusive, unauthorized scraping.” He did not respond to questions about other companies’ use of the material as training data.

Among the videos used by AI companies are 146 from Einstein Parrot, a channel with nearly 150,000 subscribers. The African grey’s caretaker, Marcia, who didn’t want to use her last name for fear of endangering the famous bird’s safety, said at first she thought it was funny to learn AI models had ingested words of a mimicking parrot.

“Who would want to use a parrot’s voice?” Marcia said. “But then, I know that he speaks very well. He speaks in my voice. So he’s parroting me, and then AI is parroting the parrot.”

Once ingested by AI, data cannot be unlearned. Marcia was troubled by all the unknown ways in which her bird’s information could be used, including creating a digital duplicate parrot and, she worried, making it curse.

“We’re treading on uncharted territory,” Marcia said.


