What will be the primary application domain for Google DeepMind's SCoRe method by the end of 2024?
Education • 25%
Healthcare • 25%
Finance • 25%
Other • 25%
Resolution source: Official announcements from Google DeepMind or credible news sources
Google DeepMind's SCoRe Achieves 15.6% Gain in LLM Self-Correction for MATH
Sep 20, 2024, 05:06 PM
Google DeepMind has developed SCoRe, a multi-turn online reinforcement learning (RL) approach that improves the self-correction capabilities of large language models (LLMs). The method trains entirely on self-generated data and achieves state-of-the-art self-correction performance. It addresses a limitation of supervised fine-tuning (SFT), which has been found ineffective for self-correction because of a distribution mismatch between the training data and the model's own responses. The research paper, titled 'Training Language Models to Self-Correct via Reinforcement Learning,' has gained significant attention, including being highlighted on Hacker News for AI papers. SCoRe improved self-correction by 15.6% on reasoning problems from MATH and by 9.1% on the HumanEval coding benchmark.
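For readers wondering how a self-correction gain like the one reported might be computed, here is a minimal sketch. It assumes the gain is the absolute difference between a model's accuracy on its second (revised) attempt and its accuracy on its first attempt over the same problems. The function names and the toy results are hypothetical and for illustration only; they are not taken from the paper or from any Google DeepMind code.

```python
from typing import Sequence


def accuracy(correct: Sequence[bool]) -> float:
    """Fraction of problems answered correctly."""
    if not correct:
        raise ValueError("need at least one problem")
    return sum(correct) / len(correct)


def self_correction_gain(first_attempt: Sequence[bool],
                         second_attempt: Sequence[bool]) -> float:
    """Assumed definition: second-attempt accuracy minus first-attempt accuracy."""
    if len(first_attempt) != len(second_attempt):
        raise ValueError("both attempts must cover the same problems")
    return accuracy(second_attempt) - accuracy(first_attempt)


# Toy, made-up results for five problems (True = correct answer).
first = [True, False, False, True, False]   # 2 of 5 correct
second = [True, True, False, True, False]   # 3 of 5 correct after revising

gain = self_correction_gain(first, second)
print(f"Self-correction gain: {gain * 100:.1f} percentage points")
```

Under this definition a positive value means the revised answers are right more often than the first drafts, while a negative value means the model's revisions made things worse.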