OpenAI’s Whisper transcription tool has hallucination issues, researchers say.

According to a report from the Associated Press, software engineers, developers, and academic researchers have serious concerns about transcriptions produced by OpenAI's Whisper.

There’s been no shortage of discussion about generative AI’s tendency to hallucinate (basically, to make things up), but it’s a bit surprising that this is an issue in transcription, where you’d expect the output to closely follow the audio being transcribed.

Instead, researchers told the AP that Whisper has introduced everything from racist comments to fictitious medical treatments into transcripts. That could be particularly risky as Whisper is adopted in hospitals and other healthcare settings.

A University of Michigan researcher studying public meetings found hallucinations in eight out of every 10 audio transcripts. A machine learning engineer who examined more than 100 hours of Whisper transcriptions found hallucinations in more than half of them. And one developer reported finding hallucinations in nearly all of the 26,000 transcripts he created with Whisper.

An OpenAI spokesperson said the company is “continuously working to improve the accuracy of its models, including reducing hallucinations,” and noted its usage policy prohibits the use of Whisper “in certain high-stakes decision-making situations.”

“We thank the researchers for sharing their findings,” they said.