What will be the primary use case for DeepSeek-V3 by end of 2025?
Natural Language Processing • 25%
Coding Assistance • 25%
Data Analysis • 25%
Other • 25%
Industry reports and surveys from AI-focused publications
DeepSeek Launches Open-Source DeepSeek-V3 Model with 671 Billion Parameters, 60 Tokens/Second Speed, and $5.6 Million Training Cost
Dec 26, 2024, 02:37 PM
DeepSeek has officially released its latest AI model, DeepSeek-V3, which features 671 billion parameters in a mixture-of-experts (MoE) architecture. The model activates 37 billion parameters per token and processes 60 tokens per second, roughly three times faster than its predecessor, DeepSeek-V2. Notably, DeepSeek-V3 achieved a score of 60.4 on the LiveBench benchmark, outperforming competitors such as Claude 3.5 Sonnet and GPT-4o across various tasks. The model was trained on 14.8 trillion tokens at a cost of approximately $5.6 million, significantly lower than the training costs of comparable models from Western labs. DeepSeek-V3 also shows a marked improvement in coding capability, jumping from a 17% success rate for its predecessor to 48% on the Aider benchmark. The model is fully open-source and available on HuggingFace, marking a notable advancement in the open-source AI landscape.
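The gap between 671 billion total parameters and 37 billion activated parameters comes from the mixture-of-experts design: a gating network routes each token to a small subset of expert sub-networks, so only a fraction of the total weights participate in any single forward pass. Below is a minimal, illustrative sketch of top-k expert routing with toy sizes — not DeepSeek's actual implementation, just the general MoE pattern:

```python
import numpy as np

def moe_layer(x, experts, gate_w, k=2):
    """Route input x to the top-k experts by gate score.

    Only k of the len(experts) expert weight matrices are used
    per token -- the MoE analogue of "activated" parameters.
    """
    scores = x @ gate_w                       # one gate score per expert
    top_k = np.argsort(scores)[-k:]           # indices of the k highest-scoring experts
    weights = np.exp(scores[top_k])
    weights /= weights.sum()                  # softmax over the selected experts only
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top_k))

rng = np.random.default_rng(0)
d, n_experts = 8, 16                          # toy dimensions
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((d, n_experts))
x = rng.standard_normal(d)

out = moe_layer(x, experts, gate_w, k=2)
total_params = n_experts * d * d              # parameters across all experts
active_params = 2 * d * d                     # parameters touched for this one token
print(out.shape, active_params / total_params)
```

With 2 of 16 experts selected, only 12.5% of the expert parameters are touched per token; DeepSeek-V3's 37B-of-671B ratio (~5.5%) works the same way at far larger scale.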