DeepSeek Launches Open-Source V3 Model with 671 Billion Parameters, 60 Tokens/Second, and $5.6 Million Training Cost
Dec 26, 2024, 03:08 PM
DeepSeek has officially released its latest AI model, DeepSeek-V3, a 671-billion-parameter model built on a Mixture-of-Experts (MoE) architecture. The model is reported to be three times faster than its predecessor, reaching a processing speed of 60 tokens per second. DeepSeek-V3 has already placed second on the Aider polyglot leaderboard with a score of 48%, ahead of Claude 3.5 Sonnet and other models. It was trained on 14.8 trillion tokens and uses 256 experts, with only 37 billion parameters activated per token. Notably, training cost approximately $5.6 million, a substantial reduction compared to similar models, making DeepSeek-V3 a competitive alternative to proprietary systems such as GPT-4 and Claude 3.5 Sonnet. The model is fully open-source and is now available on Hugging Face, reflecting a growing trend in the AI community toward accessible, high-performance models.
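For readers unfamiliar with how a Mixture-of-Experts model activates only a fraction of its parameters per token, the sketch below shows a toy top-k routed MoE layer in PyTorch. It is purely illustrative: the hidden sizes, the choice of top_k = 8, and the layer structure are placeholder assumptions and do not reflect DeepSeek-V3's actual implementation; only the idea of routing each token to a small subset of 256 experts mirrors the description above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Toy top-k routed Mixture-of-Experts layer.

    A router scores all experts for each token, but only the top-k experts
    actually run, so only a small fraction of the layer's parameters is
    active per token (the idea behind "37B active out of 671B total").
    All sizes here are placeholder assumptions for illustration.
    """

    def __init__(self, d_model=64, d_ff=128, n_experts=256, top_k=8):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)          # (n_tokens, n_experts)
        weights, idx = probs.topk(self.top_k, dim=-1)      # keep only top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in idx[:, slot].unique():                # run each selected expert once
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[int(e)](x[mask])
        return out

if __name__ == "__main__":
    layer = ToyMoELayer()
    tokens = torch.randn(4, 64)                            # 4 tokens, d_model = 64
    print(layer(tokens).shape)                             # torch.Size([4, 64])
```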
Markets
No • 50%
Yes • 50%
Resolution source: Aider polyglot leaderboard rankings published online

Yes • 50%
No • 50%
Resolution source: Official announcements from major AI companies or DeepSeek

No • 50%
Yes • 50%
Resolution source: Aider polyglot leaderboard rankings published online

Less than 100,000 • 25%
100,000 to 500,000 • 25%
500,001 to 1,000,000 • 25%
More than 1,000,000 • 25%
Resolution source: Download statistics from the Hugging Face platform

1st Place • 25%
2nd Place • 25%
3rd Place • 25%
4th Place or lower • 25%
Resolution source: Aider polyglot leaderboard rankings published online

Technology • 25%
Finance • 25%
Healthcare • 25%
Other • 25%
Resolution source: Industry reports and announcements from companies using DeepSeek-V3