AIs can generate near-verbatim copies of novels from training data



A US court last year found that Anthropic’s training of LLMs on some copyrighted content could be considered fair use as it was deemed “transformative.”

But it determined that storing pirated works was “inherently, irredeemably infringing,” a finding that led the AI group to pay $1.5 billion to settle the lawsuit.

In Germany, a ruling from November last year found that OpenAI had infringed on copyright because its model had memorized song lyrics. The case, brought by GEMA, an association representing composers, lyricists, and publishers, was considered a landmark ruling in the EU.

Rudy Telscher, a partner at law firm Husch Blackwell, said reproducing an entire book without jailbreaking is “clearly a copyright violation.” But “it’s a matter of whether this is happening enough that [AI models] could be vicariously liable for the infringement,” he added.

Anthropic said the jailbreaking technique used in the Stanford and Yale research was impractical for normal users, and that extracting the text this way would take more effort than simply purchasing the content.

The company added that its model does not store copies of specific datasets but learns from patterns and relationships between words and strings in its training data.

xAI, OpenAI, and Google did not respond to requests for comment.

The fact that AI labs have put safeguards in place to prevent training data from being extracted means they are aware of the problem, said Imperial’s de Montjoye.

Ben Zhao, a computer science professor at the University of Chicago, questioned whether AI labs really needed to use copyrighted content in training data to create cutting-edge models in the first place.

“Whether the technical result can be done or not, it’s still a question of should we be doing this?” Zhao said. “The legal side should eventually hold their ground and really be the arbiter in this whole process.”

© 2026 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.


