GPT-3 can reason as well as college students, study finds
Enter the AI titan: GPT-3, whose cognitive prowess is reportedly rivaling that of human college undergraduates.
[Aug. 8, 2023: Staff Writer, The Brighter Side of News]
Recent breakthrough research from UCLA psychologists suggests that this hallmark of human intelligence may not be exclusive to us. (CREDIT: Nate Edwards/BYU Photo)
In the sprawling landscape of human cognition, analogical reasoning – the art of solving new problems based on past experiences – has been a defining feature of our species. It's the innate ability to relate unfamiliar situations to those we know, leveraging past solutions to solve current issues.
However, recent breakthrough research from UCLA psychologists suggests that this hallmark of human intelligence may not be exclusive to us. Enter the AI titan: GPT-3, whose cognitive prowess is reportedly rivaling that of human college undergraduates.
GPT-3 vs. The College Student: A Match-Up of Minds
Published in Nature Human Behaviour, the study documents GPT-3's ability to solve reasoning problems of the kind found on intelligence tests and standardized tests such as the SAT.
The finding raises the question, as the authors put it: “Is GPT-3 emulating human thought through the vast expanse of its training data, or is it charting a new course in cognitive processes?”
Because OpenAI, GPT-3's creator, keeps the model's inner workings closely guarded, the UCLA researchers can only speculate about how it operates. Notably, while GPT-3 performed surprisingly well in some reasoning areas, it faltered in others, revealing a mix of strengths and weaknesses.
Dr. Taylor Webb, the study’s primary author and a postdoctoral researcher at UCLA, underscores this point: “No matter how impressive our results, it’s crucial to remember that this system is riddled with limitations. While it can navigate analogical reasoning, it stumbles with tasks that humans find simple, like using tools for a physical task.”
To probe GPT-3’s capabilities, the researchers converted visual problems from the Raven’s Progressive Matrices into textual formats that the AI could process. Forty UCLA undergraduates were also given the same problems for a comparative analysis.
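To give a rough sense of what turning a visual puzzle into text can look like, here is a minimal Python sketch. It is an illustration only, not the researchers' code: the bracketed-digit layout, the `[?]` marker for the missing cell, and the names `format_cell` and `matrix_to_prompt` are all assumptions made for this example.

```python
from typing import List, Optional, Sequence

# A cell holds the digits that stand in for a shape's attributes.
# None marks the blank cell the solver must fill in (an assumed convention).
Cell = Optional[Sequence[int]]


def format_cell(cell: Cell) -> str:
    """Render one grid cell as bracketed digits, or '[?]' if it is the blank."""
    if cell is None:
        return "[?]"
    return "[" + " ".join(str(d) for d in cell) + "]"


def matrix_to_prompt(grid: List[List[Cell]]) -> str:
    """Turn a 3x3 grid into a plain-text puzzle, one row per line."""
    if len(grid) != 3 or any(len(row) != 3 for row in grid):
        raise ValueError("expected a 3x3 grid")
    return "\n".join(" ".join(format_cell(c) for c in row) for row in grid)


# Hypothetical puzzle: the digits count upward and the last cell is blank.
example = [
    [[1], [2], [3]],
    [[4], [5], [6]],
    [[7], [8], None],
]
prompt = matrix_to_prompt(example)
print(prompt)
```

A text-only model would receive the printed grid and be asked to supply the value behind the `[?]`, much as a person would pick the missing image in the visual version.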
As one part of the experiment, UCLA scientists prompted GPT-3 to try predicting the next image in a complicated arrangement of shapes, like in the three-by-three grid pictured here. (CREDIT: Tony Stella/UCLA)
The findings? Professor Hongjing Lu, the senior author, states, “Not only did GPT-3 perform on par with humans, but its mistakes were eerily similar too.” The AI solved 80% of the problems correctly, well above the average human score and on par with the top human results.
The Analogical Arena: Where AI and Human Cognition Collide
Pushing GPT-3 further, the team presented it with previously unpublished SAT analogy questions. These problems, which the researchers believe were not part of GPT-3's training data, test the ability to recognize relationships between words. Comparing the AI's results with college applicants’ scores revealed that GPT-3 outperformed the human average.
However, the AI's prowess seemed less consistent when confronted with analogies rooted in short narratives. While it lagged behind humans in deriving meanings from these tales, its successor, GPT-4, demonstrated superior competency.
As a backdrop to these groundbreaking experiments, the UCLA team has also been refining a proprietary computer model inspired by human cognition. Professor Keith Holyoak, who contributed to the study, shares, “Our psychological AI model reigned supreme in analogical reasoning, but the latest iteration of GPT-3 threw us a curveball, matching or even exceeding our model's capabilities.”
But GPT-3's skills aren't universal. It fumbled tasks requiring a grasp of physical space. For instance, when tasked with transferring gumballs using a set of tools, GPT-3's solutions bordered on the bizarre.
Professor Lu reflects on the meteoric evolution of the technology, “Language learning models are fundamentally about word prediction. Their prowess in reasoning is unexpected, but the advancements in the last two years are undeniable.”
On the Horizon: The Quest for True AI Intelligence
The scientific odyssey now revolves around discerning if these language models are inching closer to human-like cognition or charting an entirely novel intellectual path.
Summary of study results. (CREDIT: Nature Human Behaviour)
“GPT-3's methodology might bear semblance to human cognition,” observes Professor Holyoak. “Yet, while humans didn’t evolve by consuming the internet’s entirety, GPT-3 did. The intriguing question is whether its modus operandi mirrors ours or if we're witnessing a revolutionary form of artificial intelligence.”
Answering that question would require examining the AI's underlying cognitive processes, which in turn demands access to the model itself and its vast training data. The researchers assert that this will be a pivotal step in determining AI’s future trajectory.
Dr. Webb concludes, “For a richer, more definitive exploration, access to GPT models’ backend is indispensable. Our current methodology offers a mere glimpse, and we yearn for a deeper, more conclusive insight.”
Note: Materials provided above by The Brighter Side of News. Content may be edited for style and length.