Will DeepSeek-V3 achieve first place on Aider leaderboard by end of 2025?
Yes • 50%
No • 50%
Aider polyglot leaderboard rankings published online
DeepSeek Launches Open-Source V3 Model with 671 Billion Parameters, 60 Tokens/Second, and $5.6 Million Training Cost
Dec 26, 2024, 03:08 PM
DeepSeek has officially released its latest AI model, DeepSeek-V3, which features 671 billion parameters and a Mixture-of-Experts (MoE) architecture. The model is reported to be three times faster than its predecessor, reaching a processing speed of 60 tokens per second. DeepSeek-V3 has already placed second on the Aider polyglot leaderboard with a score of 48%, ahead of Claude 3.5 Sonnet and other models. It was trained on 14.8 trillion tokens and uses 256 experts, with only 37 billion parameters activated per token. Notably, training DeepSeek-V3 cost approximately $5.6 million, a substantial reduction compared to similar models, making it a competitive alternative to proprietary systems such as GPT-4 and Claude 3.5 Sonnet. The model is fully open-source and is now available on Hugging Face, reflecting a growing trend in the AI community toward accessible, high-performance models.
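The headline numbers fit together because an MoE layer routes each token through only a small subset of its experts, so compute per token scales with the 37 billion activated parameters rather than the full 671 billion. Below is a minimal PyTorch sketch of top-k expert routing; the layer sizes, gating scheme, and the choice of k = 8 routed experts per token are illustrative assumptions, not DeepSeek-V3's actual design (only the 256-expert count comes from the story).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy top-k Mixture-of-Experts layer (illustrative, not DeepSeek-V3's design)."""

    def __init__(self, d_model=64, d_ff=128, n_experts=256, top_k=8):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts, bias=False)  # router: scores every expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = self.gate(x)                           # (n_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep only the k best experts per token
        weights = F.softmax(weights, dim=-1)            # normalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in idx[:, slot].unique():             # run each selected expert once over its tokens
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[int(e)](x[mask])
        return out  # only top_k of n_experts did any work for each token

tokens = torch.randn(4, 64)
print(MoELayer()(tokens).shape)  # torch.Size([4, 64])
```

Per-token compute in such a layer scales roughly with top_k / n_experts of the total expert parameters, which is how a 671B-parameter model can run at the cost of ~37B activated parameters.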
Related markets (options and current odds):

1st place • 25%
Top 3 • 25%
Top 10 • 25%
Outside Top 10 • 25%

Top 1 • 25%
Top 3 • 25%
Top 5 • 25%
Outside Top 5 • 25%

MMLU • 25%
MMLU-Pro • 25%
MATH-500 • 25%
Other • 25%

Yes • 50%
No • 50%

Less than 100,000 • 25%
100,000 to 500,000 • 25%
500,001 to 1,000,000 • 25%
More than 1,000,000 • 25%

1st Place • 25%
2nd Place • 25%
3rd Place • 25%
4th Place or lower • 25%