Will DeepMind's GenRM improve LLM benchmark scores by a significant margin by end of Q1 2025?
Yes • 50%
No • 50%
Published benchmark results in academic papers or tech blogs
Google DeepMind's GenRM Enhances LLM Accuracy by Verifying Outputs
Sep 3, 2024, 01:04 PM
Researchers from Google DeepMind, the University of Toronto, MILA, and UCLA have introduced an approach called Generative Reward Modeling (GenRM). GenRM improves the accuracy of Large Language Models (LLMs) by training them to verify their own outputs using next-token prediction and chain-of-thought (CoT) reasoning, so that verification draws on the same text generation capabilities the models use to produce answers.
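To make the idea of verification through next-token prediction more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the researchers' implementation: it assumes the verifier's verdict can be read as the probability assigned to a "Yes" token, that several chain-of-thought rationales are averaged, and that the highest-scoring candidate is selected. The function names and the toy logits are hypothetical; no real model is called.

```python
import math

def yes_probability(logits):
    """Softmax probability of the 'Yes' token over the given verdict-token logits."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {token: math.exp(value - peak) for token, value in logits.items()}
    return exps["Yes"] / sum(exps.values())

def cot_score(verdict_logits_per_rationale):
    """Average the 'Yes' probability across several sampled rationales (assumed scheme)."""
    probs = [yes_probability(logits) for logits in verdict_logits_per_rationale]
    return sum(probs) / len(probs)

def best_of_n(candidates, score_fn):
    """Return the candidate answer with the highest verifier score."""
    return max(candidates, key=score_fn)

# Hypothetical verdict logits: two rationales per candidate answer.
# In practice these would come from a trained verifier model.
toy_logits = {
    "x = 4": [{"Yes": 2.0, "No": 0.0}, {"Yes": 1.0, "No": 0.0}],
    "x = 5": [{"Yes": -1.0, "No": 1.0}, {"Yes": 0.0, "No": 0.0}],
}

best = best_of_n(list(toy_logits), lambda c: cot_score(toy_logits[c]))
print(best)
```

Treating the verdict as an ordinary token means the score comes from the same output distribution the model uses to generate text, which is the property the summary above attributes to GenRM.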