OpenAI will use Reddit posts to train ChatGPT under new deal

Date:

Share:


Stuff posted on Reddit is getting incorporated into ChatGPT, Reddit and OpenAI announced on Thursday. The new partnership grants OpenAI access to Reddit’s Data API, giving the generative AI firm real-time access to Reddit posts.

Reddit content will be incorporated into ChatGPT “and new products,” Reddit’s blog post said. The social media firm claims the partnership will “enable OpenAI’s AI tools to better understand and showcase Reddit content, especially on recent topics.” OpenAI will also start advertising on Reddit.

The deal is similar to one that Reddit struck with Google in February that allows the tech giant to make “new ways to display Reddit content” and provide “more efficient ways to train models,” Reddit said at the time. Neither Reddit nor OpenAI disclosed the financial terms of their partnership, but Reddit’s partnership with Google was reportedly worth $60 million.

Under the OpenAI partnership, Reddit also gains access to OpenAI large language models (LLMs) to create features for Reddit, including its volunteer moderators.

Reddit’s data licensing push

The news comes about a year after Reddit launched an API war by starting to charge for access to its data API. This resulted in many beloved third-party Reddit apps closing and a massive user protest. Reddit, which would soon become a public company and hadn’t turned a profit yet, said one of the reasons for the sudden change was to prevent AI firms from using Reddit content to train their LLMs for free.

Earlier this month, Reddit published a Public Content Policy stating: “Unfortunately, we see more and more commercial entities using unauthorized access or misusing authorized access to collect public data in bulk, including Reddit public content. Worse, these entities perceive they have no limitation on their usage of that data, and they do so with no regard for user rights or privacy, ignoring reasonable legal, safety, and user removal requests.

In its blog post on Thursday, Reddit said that deals like OpenAI’s are part of an “open” Internet. It added that “part of being open means Reddit content needs to be accessible to those fostering human learning and researching ways to build community, belonging, and empowerment online.”

Reddit has been vocal about its interest in pursuing data licensing deals as a core part of its business. Its building of AI partnerships sparks discourse around the use of user-generated content to fuel AI models without users being compensated and some potentially not considering that their social media posts would be used this way. OpenAI and Stack Overflow faced pushback earlier this month when integrating Stack Overflow content with ChatGPT. Some of Stack Overflow’s user community responded by sabotaging their own posts.

OpenAI is also challenged to work with Reddit data that, like much of the Internet, can be filled with inaccuracies and inappropriate content. Some of the biggest opponents of Reddit’s API rule changes were volunteer mods. Some have exited the platform since, and following the rule changes, Ars Technica spoke with long-time Redditors who were concerned about Reddit content quality moving forward.

Regardless, generative AI firms are keen to tap into Reddit’s access to real-time conversations from a variety of people discussing a nearly endless range of topics. And Reddit seems equally eager to license the data from its users’ posts.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder of Reddit.



Source link

━ more like this

Some of Samsung’s best new Galaxy S26 AI features are headed to the S25

Samsung is apparently not keeping all of the best new Galaxy AI tricks locked to the Galaxy S26 series. The company has confirmed...

Mercedes-Benz recalls some G-Wagon EVs due to risk of wheels falling off

Before you take your electric G-Wagon for its next off-roading excursion, you may want to stop by an authorized dealer. The German automaker...

Laser chips promise faster, greener indoor wireless at gigabit speeds

Indoor wireless is hitting limits as more devices crowd the same spectrum. Streaming, video calls, and smart home gear are pushing networks harder...

Google’s latest Play Store fix cuts through messy app reviews

Google is rolling out a fix for one of the Play Store’s most frustrating problems, messy, hard-to-scan app reviews. A new feature lets...

Ukraine defies calls from allies to stop blowing up Russia’s oil facilities – London Business News | Londonlovesbusiness.com

Ukraine has continued targeting Russian oil infrastructure despite appeals from Western allies to limit such attacks amid rising global energy prices. Russian officials reported...
spot_img