Why does AI keep making things up that aren't true?
It all comes down to how these systems are trained, according to researchers.
ChatGPT is the most well-known and widely used large language model. Still, its answers can contain errors. (Photo: Matt Rourke, AP, NTB)
You may have experienced it yourself. Sometimes one of the large AI models gives you answers that appear to be correct.
It may seem confident, but it can turn out to be completely made up.
This is a fundamental problem in language models, and hallucinations continue to be an issue even in the most advanced models, according to an article written by several researchers at OpenAI, the company behind the most well-known language model, ChatGPT.
The article is available on the preprint archive Arxiv, but it has not been peer-reviewed.
Still, it has sparked debate among researchers about how to handle uncertainty in language models, Erik Velldal tells Science Norway.
He is a professor at the University of Oslo's Department of Informatics.
Another researcher believes that OpenAI's proposed solution could completely undermine people's relationship with these AI models. More on that later.
But first – how do these hallucinations actually happen? And what should the models do when they don't have a good answer to a question?
Erik Velldal, professor of informatics. (Photo: University of Oslo)
"Things can go wrong"
Hallucinations are a completely normal part of how language models work, Velldal tells Science Norway.
"The models are probability distributions over sequences of words, not databases of facts about the world," he says.
In other words, the sentences produced by models like ChatGPT are the result of probability calculations. Each word is chosen based on what's most likely to follow the words that came before it.
As a result, things can occasionally go off track.
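To make that concrete, here is a toy Python sketch of how next-word sampling works. The words and probabilities are invented for illustration; real models choose between tens of thousands of tokens with probabilities they compute themselves.

```python
import random

# Toy illustration (not OpenAI's actual code): the model assigns a
# probability to each candidate next word and one is sampled at random.
next_word_probs = {
    "Oslo": 0.55,       # plausible continuation
    "Bergen": 0.30,     # also plausible
    "Atlantis": 0.15,   # unlikely, but still possible to sample
}

words = list(next_word_probs)
weights = list(next_word_probs.values())

# Even a low-probability word can be picked, which is one way a fluent
# but false statement can end up in the output.
chosen = random.choices(words, weights=weights, k=1)[0]
print("The capital of Norway is", chosen)
```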
"Things can go wrong, especially when it comes to topics that are poorly represented in their training data," says Velldal.
That's when models might start producing answers that sound plausible but aren't true, such as references to studies that don't exist.
"It can make up something that sounds likely, invent both titles and references that seem real. Maybe it even uses the names of real researchers as authors," he says.
The researchers behind the article note that during training, models are rewarded for guessing, but not for expressing uncertainty.
Multiple choice
When AI models are trained, various tests are used to evaluate how good they are. Some of these are multiple-choice tests, where the model picks between clearly defined answer options.
The researchers compare this to humans taking such tests: it's better to guess, because then there's at least a chance of getting it right.
"The problem is that the model is not rewarded for acknowledging that there's something it doesn't know and therefore just guesses," says Velldal.
The researchers suggest introducing an 'I don't know' option.
This would allow models to learn to express uncertainty rather than making things up. Velldal believes this could lead to fewer hallucinations – if it becomes part of how models are evaluated during training.
However, he adds that this approach might not easily apply to longer, open-ended tasks such as writing essays or summarising research.
"When people ask models to summarise knowledge or write about a certain topic, it’s much harder to see how that proposal would work," says Velldal.
Still a persistent problem
Hallucinations remain an ongoing challenge.
A very recent example comes from the Norwegian Broadcasting Corporation NRK, which let language models answer news-related questions from public broadcaster websites.
45 per cent of the answers contained significant errors, including made-up news articles with fake links, according to NRK (link in Norwegian).
Even so, Velldal says the situation has improved over the past year.
"That's mainly because language models are increasingly used in combination with internet searches and external tools. That helps ground the answers in real information," he says.
But should the models get better at communicating uncertainty?
The other researcher mentioned earlier argues that, based on the data presented in the article, this approach could result in a large language model beginning its responses with 'I don't know' roughly one-third of the time.
"Users accustomed to receiving confident answers to virtually any question would likely abandon such systems rapidly," he believes.
Velldal does not entirely agree.
"Of course, people want clear answers, but not if they're wrong. I’d prefer the model to admit it doesn’t know, but it also shouldn’t become overly cautious," he says.
Velldal points out that a model that too often responds with 'I can't help you with that' out of fear of being wrong won't be very useful either.