Where will DeepSeek-AI models rank on HuggingFace by end of 2025?
Top 10 • 25%
Top 20 • 25%
Top 50 • 25%
Outside Top 50 • 25%
Performance rankings published by HuggingFace or other AI benchmarking platforms
DeepSeek-AI Releases 685B-Parameter Open-Source Models DeepSeek-R1 and R1-Zero, Cutting Costs by Up to 30x
Jan 20, 2025, 10:11 AM
DeepSeek-AI has released DeepSeek-R1 and DeepSeek-R1-Zero, two open-source reasoning models with performance comparable to OpenAI's o1. Published under the MIT license, the models have 685 billion parameters, are classified as HLLM (Huge), and are available on HuggingFace. Built on the DeepSeek-V3 architecture, DeepSeek-R1 applies reinforcement learning during post-training to strengthen its reasoning abilities, achieving state-of-the-art results on benchmarks such as LiveCodeBench and rivaling o1-Medium. The model is served via API under the name 'deepseek-reasoner' and can also be used through the web chat and the Android/iOS apps. DeepSeek-R1's output tokens cost up to 30 times less than OpenAI's o1. The release also includes six distilled dense models based on Llama and Qwen, which the community is free to distill further and commercialize. Notably, the DeepSeek-R1-Distill-Qwen-1.5B model outperforms GPT-4o and Claude-3.5-Sonnet on math benchmarks, scoring 28.9% on AIME and 83.9% on MATH.
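The story notes that DeepSeek-R1 is exposed through DeepSeek's API as 'deepseek-reasoner'. As a minimal sketch, assuming an OpenAI-compatible endpoint at https://api.deepseek.com and an API key in a DEEPSEEK_API_KEY environment variable (both details are assumptions, not stated in the story), a request could look like this:

```python
# Sketch of calling the 'deepseek-reasoner' model via an OpenAI-compatible client.
# The base URL and environment-variable name below are assumptions for illustration.
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],  # assumed env var holding the API key
    base_url="https://api.deepseek.com",     # assumed DeepSeek endpoint
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # model name given in the story
    messages=[
        {"role": "user", "content": "How many prime numbers are there below 100?"}
    ],
)

# Print the model's final answer text.
print(response.choices[0].message.content)
```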