GPT-4 is now being trained by AI to improve accuracy | Tech Reader



[Image: An OpenAI graphic for ChatGPT-4. Credit: OpenAI]

OpenAI has developed an AI assistant, dubbed CriticGPT, to help its crowdsourced trainers further refine the GPT-4 model by spotting subtle coding errors that humans might otherwise miss.

After a large language model like GPT-4 is initially trained, it undergoes a continual process of refinement known as Reinforcement Learning from Human Feedback (RLHF). Human trainers interact with the system, annotating its responses to various questions and rating competing responses against one another, so that the model learns to return the preferred response and its accuracy improves.
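The preference-rating step above is commonly modeled as training a reward model on pairwise comparisons. As a minimal sketch (not OpenAI's actual implementation; the feature vectors and linear reward model here are purely illustrative), a Bradley-Terry-style loss pushes the model to score the human-preferred response above the rejected one:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def reward(w, x):
    # Linear reward model: score = w . x (a stand-in for a neural scorer)
    return sum(wi * xi for wi, xi in zip(w, x))

def train(pairs, dim, lr=0.5, steps=200):
    # Pairwise preference loss: -log sigmoid(r_preferred - r_rejected).
    # Gradient descent on this loss widens the score gap for each pair.
    w = [0.0] * dim
    for _ in range(steps):
        for x_pref, x_rej in pairs:
            d = reward(w, x_pref) - reward(w, x_rej)
            g = 1.0 - sigmoid(d)  # magnitude of the loss gradient w.r.t. d
            for i in range(dim):
                w[i] += lr * g * (x_pref[i] - x_rej[i])
    return w

# Hypothetical feature vectors for (preferred, rejected) response pairs
pairs = [
    ([1.0, 0.2, 0.0], [0.1, 0.9, 0.3]),
    ([0.8, 0.1, 0.1], [0.2, 0.7, 0.5]),
]
w = train(pairs, dim=3)
for x_pref, x_rej in pairs:
    assert reward(w, x_pref) > reward(w, x_rej)
```

The trained reward model then stands in for the human rater at scale, scoring new model outputs during reinforcement learning.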

The problem is that as the system's performance improves, it can outpace its trainers' expertise, making mistakes and errors increasingly difficult to identify.

These AI trainers aren't always subject matter experts, mind you. Last year, OpenAI was found to have outsourced the effort to Kenyan workers, paying them less than $2 an hour to improve its models' performance.

[Image: A CriticGPT screenshot. Credit: OpenAI]

This issue is especially acute when refining the system's code-generation capabilities, which is where CriticGPT comes in.

“We’ve trained a model, based on GPT-4, called CriticGPT, to catch errors in ChatGPT’s code output,” the company explained in a blog post Thursday. “We found that when people get help from CriticGPT to review ChatGPT code they outperform those without help 60 percent of the time.”

What’s more, the company released a whitepaper on the subject, titled “LLM Critics Help Catch LLM Bugs,” which found that “LLMs catch substantially more inserted bugs than qualified humans paid for code review, and further that model critiques are preferred over human critiques more than 80 percent of the time.”

Interestingly, the study also found that human-CriticGPT teams hallucinated less often than CriticGPT working alone, though still more often than a human working without AI assistance.

