A new study led by the Massachusetts Institute of Technology (MIT) has found that artificial intelligence language models that excel at predicting the next word in a string of text are surprisingly similar to the human brain, which appears to rely on next-word prediction to drive language processing. These predictive language models can perform tasks that require some degree of genuine understanding, such as answering questions, completing stories, or summarizing documents.
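Next-word prediction itself is simple to state: given the words so far, rank candidate continuations by probability. The models in the study are vastly larger, but as a rough illustration only, a toy bigram model (which looks at just the single preceding word, trained here on a made-up sentence) captures the basic idea:

```python
from collections import Counter, defaultdict

# Illustrative toy corpus; real models train on billions of words.
corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Count how often each word follows each preceding word (a bigram model).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most likely next word and its estimated probability."""
    counts = follows[word]
    total = sum(counts.values())
    best, n = counts.most_common(1)[0]
    return best, n / total

print(predict_next("on"))  # after "on", the corpus always continues with "the"
```

Modern models replace these lookup tables with neural networks that condition on the entire preceding context, which is what makes them good enough at the task to compare against brain data.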
Although such computer models were designed to optimize performance in search engines or texting apps, without attempting to mimic human cognition, this new study suggests that their underlying functioning resembles the ways in which language-processing centers in the human brain work.
“The better the model is at predicting the next word, the more closely it fits the human brain,” said study co-author Nancy Kanwisher, a professor of cognitive neuroscience at MIT. “It’s amazing that the models fit so well, and it very indirectly suggests that maybe what the human language system is doing is predicting what’s going to happen next.”
The researchers analyzed 43 different language models, including some that were optimized for next-word prediction. By comparing the internal activity of these artificial models with patterns of neural activity recorded while humans listened to stories or read sentences, they discovered that the best-performing next-word prediction models had activity patterns that most closely resembled those found in human brains.
“We found that the models that predict the neural responses well also tend to best predict human behavior responses, in the form of reading times. And then both of these are explained by the model performance on next-word prediction. This triangle really connects everything together,” explained study first author Martin Schrimpf, an MIT graduate student.
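One leg of that triangle, the link between a model's word-by-word predictions and human reading times, can be illustrated with a simple correlation. The numbers below are invented for illustration (the study's actual analysis pipeline is more involved): words a model finds surprising (low probability, high surprisal) tend to be read more slowly.

```python
import math

# Hypothetical per-word values: a model's surprisal (negative log probability
# of each word) and human reading times in milliseconds. Illustrative only.
surprisal = [2.1, 5.8, 1.3, 7.2, 3.5, 6.1]
reading_ms = [210, 340, 190, 410, 260, 350]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(surprisal, reading_ms)
print(f"correlation between surprisal and reading time: r = {r:.2f}")
```

In this toy data the correlation is strongly positive, mirroring the study's finding that the same models that predict neural responses well also predict behavioral measures like reading times.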
The researchers now plan to build variants of these language processing models that can perform other kinds of tasks, such as the construction of perceptual representations of the physical world.
“If we’re able to understand what these language models do and how they can connect to models which do things that are more like perceiving and thinking, then that can give us more integrative models of how things work in the brain,” explained study senior author Joshua Tenenbaum.
“This could take us toward better artificial intelligence models, as well as giving us better models of how more of the brain works and how general intelligence emerges, than we’ve had in the past.”
The study is published in the journal Proceedings of the National Academy of Sciences.