DeepNewz Markets

Market

Which research group will achieve the next significant improvement in LLM self-correction by the end of 2024?

Google DeepMind • Hacker News

Resolution / Starting Odds

Google DeepMind • 25%

OpenAI • 25%

Microsoft • 25%

Other • 25%

Published research papers or official announcements from major research groups

Story

Google DeepMind's SCoRe Achieves 15.6% Gain in LLM Self-Correction for MATH

Sep 20, 2024, 05:06 PM

Google DeepMind has developed SCoRe, a multi-turn online reinforcement learning (RL) approach for improving the self-correction capabilities of large language models (LLMs). The method trains entirely on self-generated data and achieves state-of-the-art self-correction performance. It addresses a limitation of supervised fine-tuning (SFT), which has been found ineffective for self-correction because of a distribution mismatch between the training data and the model's own responses. The paper, titled 'Training Language Models to Self-Correct via Reinforcement Learning,' has drawn significant attention, including being highlighted on Hacker News. SCoRe improved self-correction by 15.6% on reasoning problems from MATH and by 9.1% on the HumanEval coding benchmark.
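To make the idea of a "gain in self-correction" concrete, here is a minimal sketch of how such a gain could be computed from a model's first attempt and its revised second attempt on the same problems. This is an illustration under stated assumptions, not the paper's evaluation code: the function name, the dictionary keys, and the sample outcomes are all hypothetical, and the story itself does not spell out the exact metric definitions.

```python
from typing import Sequence


def self_correction_gain(first_ok: Sequence[bool], second_ok: Sequence[bool]) -> dict:
    """Compare accuracy of first attempts against revised second attempts.

    Hypothetical helper for illustration; not taken from the SCoRe paper.
    """
    if len(first_ok) != len(second_ok) or not first_ok:
        raise ValueError("need two equal-length, non-empty result lists")
    n = len(first_ok)
    acc_first = sum(first_ok) / n
    acc_second = sum(second_ok) / n
    # Fraction of problems fixed by the revision (wrong -> right)
    # and fraction broken by it (right -> wrong).
    fixed = sum((not a) and b for a, b in zip(first_ok, second_ok)) / n
    broken = sum(a and (not b) for a, b in zip(first_ok, second_ok)) / n
    return {
        "acc_first": acc_first,
        "acc_second": acc_second,
        "gain": acc_second - acc_first,
        "fixed": fixed,
        "broken": broken,
    }


# Made-up outcomes for five problems (illustrative only, not data from the paper).
first = [True, False, False, True, False]
second = [True, True, False, True, True]
print(self_correction_gain(first, second))
```

Under this reading, a positive gain means revisions fix more answers than they break, and a negative gain means the second attempt makes things worse on balance.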

Similar markets

Quantinuum • 25%

Other • 25%

Which field will see the first major research breakthrough attributed to an LLM-generated idea by the end of 2024?

Natural Language Processing • 25%

Computer Vision • 25%

Biomedical Research • 25%

Other • 25%

Which type of task will see significant performance improvement in LLMs due to the Prover-Verifier Games approach by the end of 2024?

Math Problem Solving • 25%

Legal Document Analysis • 25%

Medical Diagnosis • 25%

Other • 25%

Which benchmark will Google DeepMind's new LLM approach top first by end of 2024?

GLUE • 25%

SuperGLUE • 25%

SQuAD • 25%

Other • 25%

What will be the most notable improvement in LLMs due to DataGemma models by end of 2024?

Reduction in hallucinations • 25%

Improvement in factual accuracy • 25%

Enhanced numerical and statistical data integration • 25%

Other • 25%

Most improved metric for LLMs using new retrieval method by end of 2024

Accuracy • 33%

Reasoning capabilities • 33%

Context handling • 33%

Which major tech company will be the first to announce a significant research initiative using LLMs for ideation by the end of 2024?

Google • 25%

Microsoft • 25%

OpenAI • 25%

Other • 25%

Will a significant breakthrough in NLP research be attributed to an idea generated by an LLM by the end of 2024?

Will Google DeepMind's new LLM approach achieve significant benchmark improvement by end of 2024?

Which company will announce the next major breakthrough in quantum computing error correction by end of 2024?

New LLM retrieval method included in benchmark leaderboard by end of 2024?

Will DeepMind's GenRM improve LLM benchmark scores by a significant margin by end of Q1 2025?

Will Google Quantum AI publish another breakthrough in quantum error correction by the end of 2024?

Will another major tech company adopt Google DeepMind's SCoRe method by the end of 2024?

Will Google DeepMind's SCoRe method achieve a 20% gain in self-correction for MATH by mid-2025?

Will Google DeepMind's SCoRe method be integrated into a commercial product by the end of Q3 2024?

What will be the next major milestone in LLM self-correction achieved by Google DeepMind by the end of 2024?
