Study: Meta AI model can reproduce almost half of Harry Potter book


The Google Books precedent probably can’t protect Meta against this second legal theory because Google never made its books database available for users to download—Google almost certainly would have lost the case if it had done that.

In principle, Meta could still convince a judge that copying 42 percent of Harry Potter was allowed under the flexible, judge-made doctrine of fair use. But it would be an uphill battle.

“The fair use analysis you’ve gotta do is not just ‘is the training set fair use,’ but ‘is the incorporation in the model fair use?’” Lemley said. “That complicates the defendants’ story.”

Grimmelmann also said there’s a danger that this research could put open-weight models in greater legal jeopardy than closed-weight ones. The Cornell and Stanford researchers could only do their work because they had access to the underlying model, and hence to the token probability values that allowed efficient calculation of probabilities for sequences of tokens.

Most leading labs, including OpenAI, Anthropic, and Google, have increasingly restricted access to these so-called logits, making it more difficult to study these models.
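To make concrete why access to logits matters, here is a minimal Python sketch of how the probability of a whole token sequence can be computed from per-position logits. It is an illustration under stated assumptions, not the researchers' actual code: the function names, the toy three-token vocabulary, and the hand-written logit values are all hypothetical, and a real model would supply the logit vectors from a forward pass.

```python
import math


def log_softmax(logits):
    """Turn raw logits into log-probabilities (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_log_prob(step_logits, token_ids):
    """Sum log P(token_t | earlier tokens) over a sequence.

    step_logits[t] is the logit vector at position t. Here it is a
    hypothetical hand-written input; with an open-weight model it would
    come from running the model on the preceding text.
    """
    if len(step_logits) != len(token_ids):
        raise ValueError("need one logit vector per token")
    total = 0.0
    for logits, tok in zip(step_logits, token_ids):
        total += log_softmax(logits)[tok]
    return total


# Toy example (assumed values): 3-token vocabulary, 2-token continuation.
toy_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 3.0]]
prob = math.exp(sequence_log_prob(toy_logits, [0, 2]))
print(f"probability of the 2-token sequence: {prob:.4f}")
```

Because the sequence probability is just a product of per-token probabilities (a sum in log space), anyone holding the logits can score a specific passage directly instead of sampling the model many times and counting how often the passage appears.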

Moreover, if a company keeps model weights on its own servers, it can use filters to try to prevent infringing output from reaching the outside world. So even if the underlying OpenAI, Anthropic, and Google models have memorized copyrighted works in the same way as Llama 3.1 70B, it might be difficult for anyone outside the company to prove it.

This kind of filtering also makes it easier for companies with closed-weight models to invoke the Google Books precedent. In short, copyright law might create a strong disincentive for companies to release open-weight models.

“It’s kind of perverse,” Mark Lemley told me. “I don’t like that outcome.”

On the other hand, judges might conclude that it would be bad to effectively punish companies for publishing open-weight models.

“There’s a degree to which being open and sharing weights is a kind of public service,” Grimmelmann told me. “I could honestly see judges being less skeptical of Meta and others who provide open-weight models.”

Timothy B. Lee was on staff at Ars Technica from 2017 to 2021. Today, he writes Understanding AI, a newsletter that explores how AI works and how it’s changing our world. You can subscribe here.


