ChatGPT: The Latest Buzz
Researchers from Stanford University and UC Berkeley recently tested GPT-3.5 and GPT-4, two large language models. The results showed that the performance and behavior of these models can vary significantly over time. For example, GPT-4's accuracy in identifying prime numbers dropped from 97.6% in the March 2023 version to 2.4% in the June 2023 version. The June version also made more formatting mistakes in code generation.
Experts’ Reactions
AI expert Gary Marcus expressed concern about the instability of large language models, suggesting that it could be their downfall. Jim Fan, a senior scientist at Nvidia, speculated that OpenAI’s attempt to make GPT-4 safer might have compromised its usefulness and cognitive skills. However, Princeton professor Arvind Narayanan and a PhD student countered that a change in behavior does not necessarily indicate a degradation in capability.
OpenAI’s Response
In response to user criticism, Peter Welinder, a vice-president at OpenAI, reassured users that GPT-4 improves with each new version. He suggested that heavier usage may reveal issues that users had not noticed before. Logan Kilpatrick, lead of developer relations at OpenAI, also addressed the concerns on Twitter and stated that the company is actively investigating the reported issues.
Impact on Users and Businesses
ChatGPT has the potential to automate various human resources tasks, such as onboarding, training, performance management, and employee queries. However, integrating OpenAI’s APIs into business workflows requires continuous monitoring, retraining, and fine-tuning to ensure accurate and up-to-date output. The variation in AI model behavior poses a significant challenge in this regard.
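As a rough illustration of what such monitoring could look like, the sketch below scores two model snapshots on a fixed set of prime-number questions, in the spirit of the researchers' test, and flags a drop in accuracy. Everything here is an assumption made for illustration: the `ask_model` callable, the stub snapshots, the tiny benchmark, and the 5-point tolerance are hypothetical, and no real OpenAI API call is made.

```python
from typing import Callable, Iterable


def is_prime(n: int) -> bool:
    """Ground-truth primality check by trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def accuracy(ask_model: Callable[[int], str], numbers: Iterable[int]) -> float:
    """Fraction of numbers where the model's yes/no answer matches ground truth.

    `ask_model` is a hypothetical wrapper that asks a model snapshot
    "Is n prime?" and returns its text answer.
    """
    numbers = list(numbers)
    if not numbers:
        raise ValueError("benchmark must contain at least one number")
    correct = 0
    for n in numbers:
        predicted = ask_model(n).strip().lower().startswith("yes")
        if predicted == is_prime(n):
            correct += 1
    return correct / len(numbers)


def has_drifted(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """True when accuracy fell by more than `tolerance` (an assumed threshold)."""
    return baseline - current > tolerance


# Stub snapshots standing in for real model calls (assumptions, not real behavior).
def snapshot_a(n: int) -> str:
    return "Yes" if is_prime(n) else "No"


def snapshot_b(n: int) -> str:
    return "No"  # always answers "No", whatever the number


BENCHMARK = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

if __name__ == "__main__":
    old = accuracy(snapshot_a, BENCHMARK)
    new = accuracy(snapshot_b, BENCHMARK)
    print(f"baseline={old:.2f} current={new:.2f} drifted={has_drifted(old, new)}")
```

In practice the stubs would be replaced by calls to pinned model versions, and the benchmark by prompts drawn from the business workflow being monitored, re-run whenever the provider ships a new version.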
A Boost for Open-Source LLMs?
Meta recently released Llama 2, the second version of its free open-source language model, providing an alternative to proprietary models like ChatGPT Plus and Google’s Bard. Additionally, Databricks Inc., whose co-founder and chief technology officer Matei Zaharia is one of the researchers behind the tests, has open-sourced its own LLM, Dolly 2.0. Hugging Face’s BigScience Large Open-science Open-access Multilingual Language Model (BLOOM) is also available for researchers to use.
Updated: 20 Jul 2023, 11:46 PM IST

Deepak Sen is a tech enthusiast who covers the latest technological innovations, from AI to consumer gadgets. His articles provide readers with a glimpse into the ever-evolving world of technology.