What ranking will DeepSeek-R1 achieve on AI performance benchmarks by the end of 2025?
Top 1 • 25%
Top 2-5 • 25%
Top 6-10 • 25%
Outside Top 10 • 25%
Performance benchmark reports from AI research organizations
DeepSeek Releases MIT-Licensed 685B DeepSeek-R1 Model Rivaling OpenAI's o1 at 30x Lower Cost
Jan 20, 2025, 12:34 PM
DeepSeek has officially released its new open-source reasoning models, DeepSeek-R1 and DeepSeek-R1-Zero, under the MIT License. The models, with 685 billion parameters, perform on par with OpenAI's o1 across math, code, and reasoning tasks: DeepSeek-R1-Zero achieved 71.0% pass@1 on AIME 2024, comparable to OpenAI's o1, and reached 86.7% with majority voting, surpassing it. The release includes a technical report detailing a training pipeline built on large-scale reinforcement learning, with DeepSeek-R1-Zero trained without any supervised fine-tuning and a 'language consistency reward' introduced to reduce language mixing in reasoning outputs. The models are available on Hugging Face and through the DeepSeek website and API, as well as in the company's chat web app and Android/iOS apps. DeepSeek also released smaller distilled models, including a 1.5B model distilled from Qwen that outperforms GPT-4o and Claude 3.5 Sonnet on math benchmarks, scoring 28.9% on AIME and 83.9% on MATH. The models offer significant cost savings, with DeepSeek-R1's API pricing up to roughly 30 times cheaper than OpenAI's o1.
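For readers who want to probe the claims themselves, the sketch below shows one way to query the smallest distilled checkpoint locally with Hugging Face transformers. The repo id, sampling settings, and prompt are illustrative assumptions based on the public release, not details taken from the story above.

```python
# Minimal sketch: generate a response from the 1.5B distilled R1 checkpoint
# using Hugging Face transformers (assumed repo id from the public release).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed distilled checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# R1-style models emit their chain of thought in <think>...</think> tags
# before the final answer, so expect a long reasoning trace in the output.
messages = [{"role": "user", "content": "What is 17 * 24? Reason step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, temperature=0.6, do_sample=True)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The full 685B model and the hosted API serve the same chat-style interface, so a similar prompt can be sent through DeepSeek's own endpoints; hardware requirements for the full model are far beyond a single GPU, which is why the distilled variant is used here.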