Meta Boosts AI Model Accuracy with New Iterative RPO Method
May 1, 2024, 02:17 AM
Meta has developed and applied a new method called Iterative Reasoning Preference Optimization (Iterative RPO) to strengthen the reasoning capabilities of its AI models, demonstrated on Llama-2-70B-Chat. In each iteration, the method samples multiple chain-of-thought candidates from the current model, builds preference pairs of correct versus incorrect reasoning based on whether the final answer matches the reference, and trains the model on those pairs with a preference-optimization loss. Accuracy improved substantially across reasoning benchmarks: GSM8K from 55.6% to 81.6%, MATH from 12.5% to 20.8%, and ARC-Challenge from 77.8% to 86.7%. Separately, the LLM2Vec approach was applied to the Meta-Llama-3-8B model, improving its performance on text-embedding tasks.
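Below is a minimal PyTorch sketch of the two steps summarized above: constructing preference pairs from sampled chains of thought, and computing a DPO-style pairwise loss with an added negative log-likelihood term on the chosen sequence, which is the loss form commonly described for Iterative RPO. The `sample_fn` callback, the string-equality answer check, and the coefficients `beta` and `alpha` are illustrative assumptions, not Meta's released implementation.

```python
import random
import torch
import torch.nn.functional as F

def build_preference_pairs(question, gold_answer, sample_fn, n_samples=8, max_pairs=4):
    """Sample chain-of-thought candidates and pair correct against incorrect ones.

    sample_fn(question) -> (cot_text, final_answer) is a hypothetical wrapper
    around the model's sampling loop; answers are compared by string equality.
    """
    correct, incorrect = [], []
    for _ in range(n_samples):
        cot, answer = sample_fn(question)
        (correct if answer == gold_answer else incorrect).append(cot)
    if not correct or not incorrect:
        return []  # need at least one winner and one loser to form a pair
    return [(random.choice(correct), random.choice(incorrect)) for _ in range(max_pairs)]

def iterative_rpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, nll_w,
                       beta=0.1, alpha=1.0):
    """DPO-style pairwise loss plus an NLL term on the chosen (correct) sequence.

    Arguments are per-example summed log-probabilities (tensors of shape [batch]);
    nll_w is the length-normalized negative log-likelihood of the chosen chain
    of thought under the current policy.
    """
    margin = beta * ((policy_logp_w - ref_logp_w) - (policy_logp_l - ref_logp_l))
    dpo_term = -F.logsigmoid(margin)
    return (dpo_term + alpha * nll_w).mean()

# Toy usage with a fake sampler, just to show the data flow.
if __name__ == "__main__":
    def fake_sample(_question):
        ans = random.choice(["72", "68"])
        return f"... reasoning ... The answer is {ans}.", ans

    pairs = build_preference_pairs("Natalia sold clips...", "72", fake_sample)
    print(f"{len(pairs)} preference pairs")

    batch = 4
    loss = iterative_rpo_loss(
        policy_logp_w=torch.randn(batch), policy_logp_l=torch.randn(batch),
        ref_logp_w=torch.randn(batch), ref_logp_l=torch.randn(batch),
        nll_w=torch.rand(batch),
    )
    print(f"loss = {loss.item():.3f}")
```

In the iterative scheme, the model trained in one round becomes the generator (and reference) for the next round, so each iteration's preference pairs are drawn from the previous iteration's checkpoint.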
Markets
No • 50% | Yes • 50% (resolution source: Public announcements by major tech companies or credible tech news outlets)
Yes • 50% | No • 50% (resolution source: Published AI research papers or official Meta announcements)
No • 50% | Yes • 50% (resolution source: Published AI research papers or official Meta announcements)
ARC-Challenge • 34% | GSM8K • 33% | MATH • 33% (resolution source: Published AI research papers or official Meta announcements)
Llama-2-70B-Chat • 50% | Meta-Llama-3-8B • 50% (resolution source: Published AI research papers or official Meta announcements)
Microsoft • 25% | Apple • 25% | Amazon • 25% | Google • 25% (resolution source: Public announcements by major tech companies or credible tech news outlets)