Monday, March 27, 2023

Reasons behind the excellent performance of Large Language Models.



The superb capabilities of Large Language Models (LLMs) such as ChatGPT, GPT-4, and Bard would be a puzzler even for people who have been optimistic about the potential of artificial intelligence.


The fact that artificial intelligence systems have achieved this level of perceived success at this stage tells us a lot about natural language, as well as about AI itself.


Because of the way natural language is organized, whatever word sequences we generate or receive are accepted as good, as long as certain grammatical rules and contextual constraints are satisfied. Within that particular domain, anything goes.
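To make this concrete (a toy illustration of mine, not from the original post; the grammar rules and vocabulary below are invented), even a handful of grammatical rules licenses a combinatorial variety of sentences, all of which count as acceptable because they satisfy the same constraints:

import random

# A toy context-free grammar: every expansion it licenses is "grammatical".
# The rules and words are illustrative assumptions, not from the post.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"], ["Det", "Adj", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"], ["a"]],
    "Adj": [["curious"], ["artificial"]],
    "N":   [["model"], ["sentence"], ["reader"]],
    "V":   [["generates"], ["accepts"]],
}

def generate(symbol="S"):
    """Expand a symbol by randomly choosing among its licensed rules."""
    if symbol not in GRAMMAR:  # terminal word
        return [symbol]
    rule = random.choice(GRAMMAR[symbol])
    return [word for part in rule for word in generate(part)]

random.seed(0)
for _ in range(3):
    print(" ".join(generate()))
# Many distinct word sequences, all equally well-formed under the same rules.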


So this is presumably how it works. Once an LLM has learned the statistical patterns in the texts available on the web (which have been generated by humans), it can produce endless examples of word sequences satisfying the contextual constraints specified by the prompt, while doing fine grammatically.
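Here is a minimal sketch of that generation loop (my own illustration, not the post's; a real LLM uses a trained neural network over web-scale text, whereas the probability table below is a hand-made assumption). The model repeatedly samples the next word from a learned conditional distribution, so the output follows the statistics of human text while remaining non-deterministic:

import random

# Toy "learned" statistics: conditional next-word probabilities.
# In an actual LLM these come from a trained network; this tiny
# bigram table is purely for illustration.
NEXT_WORD = {
    "<s>":      {"language": 0.6, "models": 0.4},
    "language": {"models": 0.9, "is": 0.1},
    "models":   {"generate": 0.7, "study": 0.3},
    "generate": {"text": 1.0},
    "study":    {"language": 0.5, "text": 0.5},
    "is":       {"flexible": 1.0},
    "flexible": {"</s>": 1.0},
    "text":     {"</s>": 1.0},
}

def sample_next(token):
    """Draw the next token from the conditional distribution."""
    words, probs = zip(*NEXT_WORD[token].items())
    return random.choices(words, weights=probs)[0]

def generate():
    token, out = "<s>", []
    while True:
        token = sample_next(token)
        if token == "</s>":
            return " ".join(out)
        out.append(token)

random.seed(1)
print(generate())  # statistically plausible, yet not fixed in advance

Every run satisfies the same constraints, but the particular word sequence is unpredictable, which is exactly the flexibility described above.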


The fact that AI systems at the present level of sophistication can generate texts perceived as proper and good is thus a glimpse into the nature of natural language itself. While the achievement is certainly remarkable, it remains to be seen whether it should be considered a hallmark of artificial general intelligence, given the incredible flexibility of the natural language system within contextual constraints, a flexibility that the LLMs have studied and exploited.


In addition, the emergent complexity exhibited in the word sequences produced by LLMs resembles the trajectories of life histories. In life, we make choices and take actions that satisfy certain constraints while remaining interestingly unpredictable. If our choices and actions became too predictable, they would be taken advantage of by other players in the great game of life.


From this point of view, the outputs of LLMs could be taken as exhibitions of life histories by artificial intelligence systems, expressed in the word sequences they generate.


(A short summary of the arguments in Ken Mogi's Street Brain Radio episode 28: The reasons behind the excellence of Large Language Models) 





