Will another major tech company adopt Google DeepMind's SCoRe method by the end of 2024?
Yes • 50%
No • 50%
Official announcements from major tech companies or credible news sources
Google DeepMind's SCoRe Achieves 15.6% Gain in LLM Self-Correction for MATH
Sep 20, 2024, 05:06 PM
Google DeepMind has developed a multi-turn online reinforcement learning (RL) approach to improve the self-correction capabilities of large language models (LLMs). The method, named SCoRe, trains entirely on self-generated data and achieves state-of-the-art self-correction performance. It addresses a limitation of supervised fine-tuning (SFT), which has been found ineffective for self-correction because of a distribution mismatch between the training data and the model's own responses. The paper, titled 'Training Language Models to Self-Correct via Reinforcement Learning,' has drawn significant attention, including a highlight among AI papers on Hacker News. SCoRe improved self-correction by 15.6% on reasoning problems from MATH and by 9.1% on the HumanEval coding benchmark.