DeepSeek-AI Releases 685B-Parameter Open-Source Models DeepSeek-R1 and R1-Zero, Cutting Costs by Up to 30x
Jan 20, 2025, 10:11 AM
DeepSeek-AI has released DeepSeek-R1 and DeepSeek-R1-Zero, two open-source reasoning models with performance comparable to OpenAI's o1. Released under the MIT license, the 685-billion-parameter models (classified as HLLM, "Huge") are available on HuggingFace. Built on the DeepSeek v3 architecture, DeepSeek-R1 applies reinforcement learning during post-training to strengthen its reasoning, achieving state-of-the-art results on benchmarks such as LiveCodeBench and rivaling o1-Medium.

The models are served via API under the name 'deepseek-reasoner' and can be used through the chat interface as well as the Android and iOS apps. DeepSeek-R1's output tokens cost up to 30 times less than OpenAI's o1.

The release also includes six distilled dense models based on Llama and Qwen, which the community may freely distill further and commercialize. Notably, DeepSeek-R1-Distill-Qwen-1.5B outperforms GPT-4o and Claude-3.5-Sonnet on math benchmarks, scoring 28.9% on AIME and 83.9% on MATH.
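The announcement names the API model as 'deepseek-reasoner'. A minimal sketch of how a request for that model might be assembled, assuming an OpenAI-style chat-completions request shape (the field names here follow that common convention and are an assumption, not DeepSeek's confirmed API contract):

```python
# Sketch: build a chat-completions style request body for the reasoning model.
# The "model"/"messages" field layout is assumed from the common OpenAI request
# shape; only the model id 'deepseek-reasoner' comes from the announcement.
import json


def build_reasoner_request(prompt: str) -> dict:
    """Assemble a request body targeting the 'deepseek-reasoner' model."""
    return {
        "model": "deepseek-reasoner",  # API name given in the release
        "messages": [{"role": "user", "content": prompt}],
    }


payload = build_reasoner_request("How many primes are there below 100?")
print(json.dumps(payload, indent=2))
```

The body would then be POSTed to the provider's chat-completions endpoint with an API key; endpoint URL and authentication details are left out since they are not stated in the summary.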
Markets

Yes • 50%
No • 50%
Resolution source: Official announcements or credible reports from top 10 tech companies (e.g., Google, Microsoft, Amazon)

Yes • 50%
No • 50%
Resolution source: Benchmark results published on reputable AI research platforms or at conferences

Yes • 50%
No • 50%
Resolution source: Benchmark scores published on the LiveCodeBench website or in reputable AI research publications

Less than 1 million requests • 25%
1 to 5 million requests • 25%
5 to 10 million requests • 25%
More than 10 million requests • 25%
Resolution source: API usage statistics released by DeepSeek-AI or third-party analytics services

Less than 10,000 downloads • 25%
10,000 to 50,000 downloads • 25%
50,000 to 100,000 downloads • 25%
More than 100,000 downloads • 25%
Resolution source: Adoption statistics or download numbers from the HuggingFace platform

Top 10 • 25%
Top 20 • 25%
Top 50 • 25%
Outside Top 50 • 25%
Resolution source: Performance rankings published by HuggingFace or other AI benchmarking platforms