What will be the next major milestone in LLM self-correction achieved by Google DeepMind by the end of 2024?
25% gain in MATH • 25%
15% gain in other domains • 25%
Integration into multiple commercial products • 25%
Other • 25%
Published research papers or official announcements from Google DeepMind
Google DeepMind's SCoRe Achieves 15.6% Gain in LLM Self-Correction for MATH
Sep 20, 2024, 05:06 PM
Google DeepMind has developed SCoRe, a multi-turn online reinforcement learning (RL) approach that improves the self-correction capabilities of large language models (LLMs). The method trains entirely on self-generated data and achieves state-of-the-art self-correction performance. It addresses a limitation of supervised fine-tuning (SFT), which was found ineffective for self-correction because of a distribution mismatch between the mistakes in the training data and the model's own responses. The research paper, titled 'Training Language Models to Self-Correct via Reinforcement Learning,' has drawn significant attention, including being highlighted on Hacker News. SCoRe achieved a 15.6% gain in self-correction on reasoning problems from MATH and a 9.1% gain on HumanEval coding problems.
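To make the headline number concrete, a self-correction gain of this kind can be read as second-attempt accuracy minus first-attempt accuracy, expressed in percentage points. The sketch below assumes that reading; the function names and the toy data are hypothetical illustrations, not DeepMind's evaluation code or results.

```python
from typing import Sequence


def accuracy(correct: Sequence[bool]) -> float:
    """Return the fraction of problems answered correctly."""
    if not correct:
        raise ValueError("need at least one problem")
    return sum(correct) / len(correct)


def self_correction_gain(
    first_attempt: Sequence[bool], second_attempt: Sequence[bool]
) -> float:
    """Second-attempt accuracy minus first-attempt accuracy, in percentage points.

    Assumption: both sequences cover the same problems in the same order.
    """
    if len(first_attempt) != len(second_attempt):
        raise ValueError("attempt lists must cover the same problems")
    return 100.0 * (accuracy(second_attempt) - accuracy(first_attempt))


# Hypothetical toy data: True means the answer was correct on that turn.
first = [True, False, False, True, False]
second = [True, True, False, True, False]

gain = self_correction_gain(first, second)
print(f"self-correction gain: {gain:.1f} percentage points")
```

A positive value means the revision turn fixed more answers than it broke; a negative value means the model made its answers worse when asked to revise.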