Ilya Sutskever explains why LLMs' ability to predict the next word shows real understanding


LLMs have produced some remarkable results in recent years, but their detractors argue they're not quite as impressive as they seem. Because LLMs work by recursively predicting the next word in their answers, some people say they're little more than stochastic parrots with no real reasoning ability. But one of the most prominent voices in the field believes otherwise.
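As a rough illustration of what "recursively predicting the next word" means, here is a minimal sketch, not Sutskever's model or any real LLM: the "model" is just bigram counts over a tiny hypothetical corpus, and generation works by repeatedly predicting the most likely next word and feeding it back in as context.

```python
from collections import Counter, defaultdict

# Tiny hypothetical corpus standing in for training data.
corpus = "the detective gathered the clues and the detective named the culprit".split()

# "Train" by counting which word follows which (a bigram model).
next_word_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_word_counts[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent word seen after `word` (greedy decoding)."""
    return next_word_counts[word].most_common(1)[0][0]

def generate(start, n_words):
    """Autoregressive loop: each predicted word becomes context for the next."""
    words = [start]
    for _ in range(n_words):
        words.append(predict_next(words[-1]))
    return " ".join(words)

print(generate("the", 3))
```

A real LLM replaces the bigram table with a neural network conditioned on the entire preceding text, which is exactly why Sutskever argues that accurate prediction can require genuine understanding: the richer the context the model must account for, the harder the prediction task becomes.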

Former OpenAI Chief Scientist Ilya Sutskever believes that simply predicting the next word can be evidence of genuine understanding. “(I will) give an analogy that hopefully will clarify why more accurate prediction of the next word leads to more understanding — real understanding,” he said in an interview.

“Let’s take an example. Say you’re reading a detective story. It’s like a complicated plot, a storyline, different characters, lots of events. Mysteries, like clues, it’s unclear. Then let’s say that on the last page of the book, the detective has gathered all the clues, gathered all the people, and said, ‘Okay, I’m going to reveal the identity of the person who committed the crime. And that person’s name is — now predict that word,'” he said.

Sutskever's point was that predicting the next word in this case, the name of the criminal, is far from trivial. To get it right, the LLM would need to absorb all the text fed into it, understand the relationships between characters, pick up on small clues, and finally reach a conclusion about who the criminal might be. In his view, that represents real reasoning power.

Sutskever has been bullish on the core technology behind LLMs for a while. More than a decade ago, he argued that scaling up neural networks could lead to serious advances in machine intelligence. As more sophisticated hardware became available, neural nets grew steadily more powerful, and advances such as the transformer architecture helped create extremely capable large language models. And with figures like NVIDIA CEO Jensen Huang saying that AI is growing eight times faster than even Moore's Law, LLMs' ability to simply predict the next word could take us very far on the journey towards superintelligence.