OpenAI Messed With the Wrong Mega-Popular Parenting Forum

Think of any topic even vaguely related to raising kids, and there’s probably a post about it on Mumsnet, the long-running, enormously popular, controversy-spurring UK-based parenting forum for mothers. Over its more than two-decade history, Mumsnet has amassed an archive of more than six billion words written by its highly engaged user base, on topics such as dirty diapers and lazy husbands. (Not to mention a bonkers rant about dolphins.)

This spring, after Mumsnet discovered that AI companies were scraping its data, the company says it decided to try to strike licensing deals with some of the major players in the space, including OpenAI, which initially expressed willingness to explore an arrangement after Mumsnet first reached out. After talks with OpenAI fell apart, Mumsnet in July announced its intention to pursue legal action.

According to Mumsnet, during those early conversations, an OpenAI strategic partnership lead told the company that datasets over 1 billion words were of interest to the AI giant. Mumsnet’s leadership was excited. “We spent quite some time in a back-and-forth with them,” Mumsnet founder and CEO Justine Roberts tells WIRED. “We had to sign some NDAs, and they wanted a lot of information from us.”

However, over a month later, OpenAI told Mumsnet that the company was no longer interested in partnering at that time, according to an email exchange reviewed by WIRED. When asked why, the OpenAI staffer characterized Mumsnet’s 6-billion-word dataset as too small to warrant a licensing arrangement, Roberts says. The staffer also noted that OpenAI is primarily interested in large datasets that the public cannot already access online, and that it wanted datasets that captured broad human experience.

The company echoed this sentiment when WIRED asked for comment. “We pursue partnerships for large-scale datasets that reflect human society and do not pursue partnerships solely for publicly available information,” says OpenAI spokesperson Kayla Wood. “We support publisher and creator choice, offering them ways to express their preferences about how their sites and content work with AI in search results and training generative AI foundation models.”

Roberts says she was “irritated” by this development. She recalls that OpenAI at first had seemed especially interested in Mumsnet because of the platform’s heavily female-written content. “It’s very high-quality conversational data,” she says. “It’s 90 percent female conversation, which is quite unusual.”

OpenAI has struck a variety of data-licensing deals with media outlets and platforms in the past year, entering into agreements with Vox Media, the Atlantic, Axel Springer, Time, and WIRED parent company Condé Nast, as well as platforms filled with user-generated content like Reddit. (Automattic, the owner of WordPress.com and Tumblr, was also said to be in licensing talks earlier this year.) As the particulars of those deals haven’t been revealed, it’s not clear how large their respective corpuses are.

When WIRED asked OpenAI about the size of datasets it will consider for commercial licensing, the company declined to share that information. But spokesperson Kayla Wood emphasizes that the company’s partnerships with publishers are “focused on displaying their content in our products and driving traffic to them.”


