Google DeepMind's SCoRe Achieves 15.6% Gain in Self-Correction for Language Models
Sep 20, 2024, 01:26 PM
Google DeepMind has developed SCoRe, a multi-turn online reinforcement learning (RL) approach that improves the self-correction capabilities of large language models (LLMs). The method trains entirely on self-generated data and achieves state-of-the-art self-correction performance: a 15.6% gain on reasoning problems from the MATH dataset and a 9.1% improvement in other areas. The research, authored by A. Kumar, V. Zhuang, R. Agarwal, and Y. Su, suggests that training on off-the-shelf datasets is less effective than RL on on-policy data generated by the model being fine-tuned.