OpenAI’s Whisper invents parts of transcriptions — a lot

Imagine going to the doctor, telling them exactly how you’re feeling, and then having the transcription add false information and alter your story. That could be the case in medical centers that use Whisper, OpenAI’s transcription tool. More than a dozen developers, software engineers and academic researchers have found evidence that Whisper creates hallucinations, or invented text, including made-up medications, racial commentary and violent remarks, ABC News reports. Yet in the last month, open-source AI platform Hugging Face saw 4.2 million downloads of Whisper’s latest version. The tool is also built into Oracle and Microsoft’s cloud computing platforms, along with some versions of ChatGPT.

The evidence of harm is extensive, with experts finding significant faults with Whisper across the board. A University of Michigan researcher found invented text in eight out of ten audio transcriptions of public meetings. In another study, computer scientists found 187 hallucinations while analyzing over 13,000 audio recordings. The trend continues: a machine learning engineer found them in about half of more than 100 hours of transcriptions, while a developer spotted hallucinations in almost all of the 26,000 transcriptions he had Whisper create.

The potential danger becomes even clearer when looking at specific examples of these hallucinations. Two professors, Allison Koenecke and Mona Sloane of Cornell University and the University of Virginia, respectively, looked at clips from a research repository called TalkBank. The pair found that nearly 40 percent of the hallucinations had the potential to be misinterpreted or misrepresented. In one case, Whisper invented that three people being discussed were Black. In another, Whisper changed “He, the boy, was going to, I’m not sure exactly, take the umbrella” to “He took a big piece of a cross, a teeny, small piece … I’m sure he didn’t have a terror knife so he killed a number of people.”

Whisper’s hallucinations also have risky medical implications. A company called Nabla uses Whisper for its medical transcription tool, which serves over 30,000 clinicians and 40 health systems and has transcribed an estimated seven million visits so far. Though the company is aware of the issue and claims to be addressing it, there is currently no way to check the validity of the transcripts: the tool erases all audio for “data safety reasons,” according to Nabla’s chief technology officer Martin Raison. The company also says that providers must quickly edit and approve the transcriptions (with all the extra time doctors have?), but that this system may change. Meanwhile, no one else can confirm the transcriptions are accurate because of privacy laws.

