Is ChatGPT’s intelligence declining? Math scores at only 2.4% suggest so.

ChatGPT: The Latest Buzz

Researchers from Stanford University and UC Berkeley recently tested GPT-3.5 and GPT-4, two large language models, and found that the models’ performance and behavior can vary significantly over time. For example, GPT-4’s accuracy in identifying prime numbers dropped from 97.6% to 2.4% between the March and June versions. The June version also made more formatting mistakes in code generation.
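
To make the accuracy figure concrete, here is a minimal sketch of how a prime-identification score could be computed. It is not the researchers’ actual test harness: `ask_model` is a hypothetical stand-in for a call to one model version, assumed to return text that starts with "yes" or "no", and the two stub functions call no real model.

```python
from typing import Callable, Iterable


def is_prime(n: int) -> bool:
    """Ground truth by trial division; adequate for small test numbers."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def prime_accuracy(ask_model: Callable[[int], str], numbers: Iterable[int]) -> float:
    """Fraction of numbers where the model's yes/no answer matches ground truth.

    ask_model is hypothetical: it represents a query to one model version.
    """
    numbers = list(numbers)
    correct = 0
    for n in numbers:
        said_prime = ask_model(n).strip().lower().startswith("yes")
        if said_prime == is_prime(n):
            correct += 1
    return correct / len(numbers)


# Stub "model versions" for illustration only.
def always_yes(n: int) -> str:
    return "Yes"


def always_no(n: int) -> str:
    return "No"


primes_only = [7, 11, 13, 17, 19]
print(prime_accuracy(always_yes, primes_only))  # 1.0
print(prime_accuracy(always_no, primes_only))   # 0.0
```

The stubs show that a model answering "yes" to everything scores 100% on a list made up only of primes, so the mix of numbers in a test set matters when reading such figures.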

Experts’ Reactions

AI expert Gary Marcus expressed concern about the instability of large language models, suggesting that it could be their downfall. Jim Fan, a senior scientist at Nvidia, speculated that OpenAI’s attempt to make GPT-4 safer might have compromised its usefulness and cognitive skills. However, Princeton professor Arvind Narayanan and a PhD student countered that a change in behavior does not necessarily indicate a degradation in capability.

OpenAI’s Response

In response to user criticism, Peter Welinder, the vice-president of OpenAI, reassured users that GPT-4 improves with each new version. He suggested that heavier usage may reveal issues that users had not noticed before. Logan Kilpatrick, lead of developer relations at OpenAI, also addressed the concerns on Twitter and stated that the company is actively investigating the reported issues.

Impact on Users and Businesses

ChatGPT has the potential to automate various human resources tasks, such as onboarding, training, performance management, and employee queries. However, integrating OpenAI’s APIs into business workflows requires continuous monitoring, retraining, and fine-tuning to ensure accurate and up-to-date output. The variation in AI model behavior poses a significant challenge in this regard.
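
One way a business might monitor for such variation is to rerun a fixed set of prompts with known answers and compare the pass rate against a stored baseline. The sketch below assumes this approach; `ask_model`, the sample prompt, and the 5-point tolerance are all hypothetical choices rather than anything OpenAI’s API provides.

```python
from typing import Callable, Dict


def pass_rate(ask_model: Callable[[str], str], cases: Dict[str, str]) -> float:
    """Share of fixed prompts whose answer matches the expected text.

    ask_model is a hypothetical wrapper around whatever API call is in use.
    """
    hits = sum(
        1
        for prompt, expected in cases.items()
        if ask_model(prompt).strip().lower() == expected.lower()
    )
    return hits / len(cases)


def has_regressed(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """Flag a drop larger than the tolerance (an assumed threshold)."""
    return (baseline - current) > tolerance


# Illustrative check with a stub instead of a real model call.
cases = {"Is 7 a prime number? Answer yes or no.": "yes"}


def stub_model(prompt: str) -> str:
    return "Yes"


current = pass_rate(stub_model, cases)
print(has_regressed(baseline=1.0, current=current))  # False
```

Run on a schedule, a check like this would surface a change in behavior soon after a model update, before it reaches employees or customers.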

A Boost for Open-Source LLMs?

Meta recently released Llama 2, the second version of its free open-source language model, providing an alternative to proprietary models like ChatGPT Plus and Google’s Bard. Additionally, Databricks Inc., led by Matei Zaharia, one of the researchers behind the tests, open-sourced its LLM called Dolly 2.0. Hugging Face’s BigScience Large Open-science Open-access Multilingual Language Model (BLOOM) is also available for researchers to utilize.


Updated: 20 Jul 2023, 11:46 PM IST

 

Reference

Denial of responsibility! SamacharCentrl is an automatic aggregator of Global media. In each content, the hyperlink to the primary source is specified. All trademarks belong to their rightful owners, and all materials to their authors. For any complaint, please reach us at – [email protected]. We will take necessary action within 24 hours.
