AI chatbots still struggle with news accuracy, study finds

A month-long experiment has raised fresh concerns about the reliability of generative AI tools as sources of news, after Google’s Gemini chatbot was found fabricating entire news outlets and publishing false reports. The findings were first reported by The Conversation, which conducted the investigation.

The experiment was led by a journalism professor specialising in computer science, who tested seven generative AI systems over a four-week period. Each day, the tools were asked to list and summarise the five most important news events in Québec, rank them by importance, and provide direct article links as sources. Among the systems tested were Google’s Gemini, OpenAI’s ChatGPT, Claude, Copilot, Grok, DeepSeek, and Aria.

The most striking failure involved Gemini inventing a fictional news outlet – examplefictif.ca – and falsely reporting a school bus drivers’ strike in Québec in September 2025. In reality, the disruption was caused by the withdrawal of Lion Electric buses due to a technical issue. This was not an isolated case. Across 839 responses collected during the experiment, AI systems regularly cited imaginary sources, provided broken or incomplete URLs, or misrepresented real reporting.

The findings matter because a growing number of people are already using AI chatbots for news

According to the Reuters Institute Digital News Report, six per cent of Canadians relied on generative AI as a news source in 2024. When these tools hallucinate facts, distort reporting, or invent conclusions, they risk spreading misinformation – particularly when their responses are presented confidently and without clear disclaimers.

For users, the risks are practical and immediate. Only 37 per cent of responses included a complete and legitimate source URL. While summaries were fully accurate in less than half of the cases, many were only partially correct or subtly misleading. In some instances, AI tools added unsupported “generative conclusions,” claiming that stories had “reignited debates” or “highlighted tensions” that were never mentioned by human sources. These additions may sound insightful but can create narratives that simply do not exist.

Errors were not limited to fabrication

Some tools distorted real stories, such as misreporting the treatment of asylum seekers or incorrectly identifying winners of major sporting events. Others made basic factual mistakes in polling data or personal circumstances. Collectively, these issues suggest that generative AI still struggles to distinguish between summarising news and inventing context.

Looking ahead, the concerns raised by The Conversation align with a broader industry review. A recent report by 22 public service media organisations found that nearly half of AI-generated news answers contained significant issues, from sourcing problems to major inaccuracies. As AI tools become more integrated into search and daily information habits, the findings underscore a clear warning: when it comes to news, generative AI should be treated as a starting point at best – not a trusted source of record.

Source link

AI chatbots still struggle with news accuracy, study finds

The findings matter because a growing number of people are already using AI chatbots for news

Errors were not limited to fabrication

━ more like this

A new breed of Android flagships is coming and it should make Samsung nervous

Watch the trailer for Science Saru’s Ghost in the Shell anime series

Apple is opening Siri to pick AI models, but there’s only only that makes sense to me

YouTube CEO opens up about AI slop, and it sounds like cozy promises

Meta’s next smart glasses sound like a treat for humans stuck with prescription lenses

━ about

━ follow us

━ subscribe