DeepSeek Launches Open-Source DeepSeek-V3 Model with 671 Billion Parameters, 60 Tokens/Second Speed, and $5.6 Million Training Cost
Dec 26, 2024, 02:37 PM
DeepSeek has officially released its latest AI model, DeepSeek-V3, which uses a mixture-of-experts (MoE) architecture with 671 billion total parameters, of which only 37 billion are activated per token. The model generates 60 tokens per second, three times faster than its predecessor, DeepSeek-V2. Notably, DeepSeek-V3 scored 60.4 on the LiveBench benchmark, outperforming competitors such as Claude 3.5 Sonnet and GPT-4o across a range of tasks. It was trained on 14.8 trillion tokens at a cost of approximately $5.6 million, far below the training costs of comparable models from Western labs. DeepSeek-V3 also shows a marked improvement in coding: its success rate on the Aider benchmark jumped from 17% for the previous version to 48%. The model is fully open-source and available on Hugging Face, a notable advance for the open-source AI landscape.
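For readers unfamiliar with the mixture-of-experts design, the toy PyTorch sketch below illustrates the general idea: a router scores each token against a pool of experts and only the top-scoring few run, which is how a model can hold 671 billion parameters while exercising only about 37 billion per token. The sizes and routing scheme here are purely illustrative and are not DeepSeek-V3's actual configuration.

```python
# Toy mixture-of-experts layer: each token activates only top_k experts,
# so most of the layer's parameters sit idle for any given token.
# Illustrative only; not DeepSeek-V3's real router or expert layout.
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    def __init__(self, dim=64, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.router = nn.Linear(dim, n_experts)
        self.top_k = top_k

    def forward(self, x):                          # x: (tokens, dim)
        scores = self.router(x)                    # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, -1) # keep top_k experts per token
        weights = weights.softmax(dim=-1)          # normalize the kept scores
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e           # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

x = torch.randn(5, 64)       # 5 tokens
print(ToyMoE()(x).shape)     # torch.Size([5, 64])
```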
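Since the weights are public, one plausible way to try the model is through the Hugging Face transformers library. The snippet below is a minimal sketch, assuming the repository ID deepseek-ai/DeepSeek-V3 and that the checkpoint ships custom MoE modeling code (hence trust_remote_code=True); consult the model card before running, and note that a 671B-parameter checkpoint requires multi-GPU server hardware.

```python
# Minimal sketch: loading DeepSeek-V3 via transformers. The repo ID and the
# trust_remote_code requirement are assumptions; check the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-V3"  # assumed Hugging Face repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # allow the repo's custom modeling code
    torch_dtype="auto",      # keep the checkpoint's native precision
    device_map="auto",       # shard the weights across available GPUs
)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```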
Markets
Market 1
- No • 50%
- Yes • 50%
Resolution source: Aider benchmark results published on official website or press releases

Market 2
- Yes • 50%
- No • 50%
Resolution source: Download statistics from Hugging Face's official website

Market 3
- No • 50%
- Yes • 50%
Resolution source: LiveBench benchmark results published on official website or press releases

Market 4
- Natural Language Processing • 25%
- Other • 25%
- Data Analysis • 25%
- Coding Assistance • 25%
Resolution source: Industry reports and surveys from AI-focused publications

Market 5
- GPT-5 • 25%
- Other • 25%
- Claude 4.0 • 25%
- Anthropic's New Model • 25%
Resolution source: LiveBench benchmark results published on official website or press releases

Market 6
- Other • 25%
- North America • 25%
- Europe • 25%
- Asia • 25%
Resolution source: Adoption statistics from industry reports and AI conferences