DeepSeek Releases MIT-Licensed 685B-Parameter Model Matching OpenAI o1, Up to 30x Cheaper
Jan 20, 2025, 01:14 PM
AI company DeepSeek has officially released DeepSeek-R1, a fully open-source reasoning model with 685 billion parameters (671B in the technical report; the Hugging Face checkpoint additionally includes multi-token-prediction weights) that claims performance on par with OpenAI's o1 model. The MIT license permits free distillation and commercial use, and DeepSeek reports API pricing up to roughly 30 times cheaper than OpenAI's o1.

DeepSeek-R1 uses reinforcement learning to strengthen its reasoning capabilities; its precursor, DeepSeek-R1-Zero, was trained purely through RL without an initial supervised fine-tuning stage. The model reportedly matches OpenAI o1 across math, code, and reasoning benchmarks and supports a context length of up to 128,000 tokens.

Both DeepSeek-R1 and DeepSeek-R1-Zero are available on Hugging Face, alongside six smaller distilled models, including 32B (Qwen-based) and 70B (Llama-based) versions. The model is accessible via the DeepSeek API, website, and apps, and the company has published a technical report detailing the model's development and training processes.
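As a minimal sketch of the API access mentioned above: DeepSeek's endpoint follows the OpenAI chat-completions format, so the standard openai Python client can be pointed at it. The base URL (https://api.deepseek.com) and model id (deepseek-reasoner) below are taken from DeepSeek's documentation at release and should be treated as assumptions if the service has since changed.

    # Sketch: querying DeepSeek-R1 through its OpenAI-compatible API.
    # Assumes base URL "https://api.deepseek.com" and model id
    # "deepseek-reasoner" (per DeepSeek's docs at release); an API key
    # is expected in the DEEPSEEK_API_KEY environment variable.
    import os

    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["DEEPSEEK_API_KEY"],
        base_url="https://api.deepseek.com",
    )

    response = client.chat.completions.create(
        model="deepseek-reasoner",
        messages=[{"role": "user", "content": "How many primes are below 100?"}],
    )

    message = response.choices[0].message
    # R1 exposes its chain of thought in a DeepSeek-specific
    # `reasoning_content` field alongside the final answer, so we read
    # it defensively with getattr.
    print(getattr(message, "reasoning_content", None))
    print(message.content)

The distilled checkpoints can likewise be loaded with standard Hugging Face tooling under their repo names (for example, deepseek-ai/DeepSeek-R1-Distill-Qwen-32B).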