Which research group will achieve the next significant improvement in LLM self-correction by the end of 2024?
Google DeepMind • 25%
OpenAI • 25%
Microsoft • 25%
Other • 25%
Published research papers or official announcements from major research groups
Google DeepMind's SCoRe Achieves 15.6% Gain in LLM Self-Correction for MATH
Sep 20, 2024, 05:06 PM
Google DeepMind has developed SCoRe, a multi-turn online reinforcement learning (RL) approach that improves the self-correction ability of large language models (LLMs). The method trains entirely on self-generated data and achieves state-of-the-art self-correction performance. It addresses a limitation of supervised fine-tuning (SFT), which the researchers found ineffective for self-correction because of a distribution mismatch between the training data and the model's own responses. The paper, titled 'Training Language Models to Self-Correct via Reinforcement Learning,' has drawn significant attention, including being highlighted on Hacker News for AI papers. SCoRe improved self-correction by 15.6% on reasoning problems from MATH and by 9.1% on coding problems from HumanEval.
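To make the reported "gain on self-correction" concrete, here is a minimal sketch of how a two-attempt evaluation might be summarized. It is an illustration under assumptions, not the paper's evaluation code: the function name `self_correction_stats`, the metric names, and the eight sample outcomes are all hypothetical.

```python
from typing import Dict, Sequence


def self_correction_stats(first: Sequence[bool], second: Sequence[bool]) -> Dict[str, float]:
    """Summarize a two-attempt evaluation (hypothetical helper).

    first[i] and second[i] record whether attempt 1 and attempt 2
    solved problem i.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty result lists")
    n = len(first)
    acc1 = sum(first) / n
    acc2 = sum(second) / n
    # Share of problems that went from wrong to right, and from right to wrong.
    fixed = sum(1 for a, b in zip(first, second) if not a and b) / n
    broken = sum(1 for a, b in zip(first, second) if a and not b) / n
    return {
        "acc_attempt1": acc1,
        "acc_attempt2": acc2,
        "delta": acc2 - acc1,  # net self-correction gain
        "fixed": fixed,
        "broken": broken,
    }


# Made-up outcomes for eight problems, for illustration only.
first = [True, False, False, True, False, True, False, False]
second = [True, True, False, True, True, False, False, True]
stats = self_correction_stats(first, second)
print(stats)
```

In this sketch the net gain equals the share of answers fixed minus the share broken, so a model that revises correct answers into wrong ones can show a negative gain even when it fixes some mistakes.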