DeepSeek Launches V3 AI Model, Outperforming OpenAI with 671 Billion Parameters, $5.576 Million Training Cost, and 60 Tokens/sec
Dec 28, 2024, 11:55 PM
Chinese AI startup DeepSeek has launched its latest model, DeepSeek V3, which is reported to outperform established models from tech giants such as OpenAI and Meta. The model has 671 billion total parameters and uses a mixture-of-experts (MoE) architecture in which approximately 37 billion parameters are active for each token. Notably, DeepSeek V3 was trained on a reported budget of $5.576 million, far below typical costs for such advanced models, which can reach hundreds of millions of dollars. Training used 2,048 Nvidia H800 GPUs, described as 'nerfed H100s,' over a two-month period.

The result challenges the prevailing assumption that substantial financial resources are necessary for serious AI development. DeepSeek V3 is also open source and competitively priced, running at 60 tokens per second for a cost as low as $2 per day. Its performance has drawn comparisons with OpenAI's ChatGPT, with claims that it can replicate ChatGPT's outputs in various tests. Experts are now questioning what DeepSeek's efficient approach means for the future of AI and for the value of high-end computing hardware, particularly Nvidia chips.
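To put the reported figures in proportion, the short Python sketch below works through the arithmetic they imply. It is a rough back-of-the-envelope calculation, not DeepSeek's own accounting. The constants are the numbers quoted in the story; the 60-day training duration (a reading of "two months") and the idea that $2 buys a full day of continuous generation at 60 tokens per second are assumptions made only for illustration, and the function names are hypothetical.

```python
# Figures quoted in the story.
TOTAL_PARAMS = 671e9        # total parameters
ACTIVE_PARAMS = 37e9        # approximate active parameters (MoE)
TRAINING_COST_USD = 5.576e6
NUM_GPUS = 2048             # H800 GPUs
TOKENS_PER_SEC = 60
COST_PER_DAY_USD = 2.0

# ASSUMPTION (not in the story): "two months" is treated as 60 days
# of continuous use of all GPUs.
ASSUMED_TRAINING_DAYS = 60

SECONDS_PER_DAY = 24 * 60 * 60


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters that is active at a time."""
    return active / total


def implied_gpu_hours(num_gpus: int, days: int) -> int:
    """GPU-hours if every GPU ran around the clock for `days` days."""
    return num_gpus * days * 24


def implied_cost_per_gpu_hour(cost_usd: float, gpu_hours: float) -> float:
    """Training budget divided by the implied GPU-hours."""
    return cost_usd / gpu_hours


def tokens_per_day(tokens_per_sec: int) -> int:
    """Tokens produced in one day of uninterrupted generation."""
    return tokens_per_sec * SECONDS_PER_DAY


def cost_per_million_tokens(cost_per_day_usd: float, tokens_per_sec: int) -> float:
    """ASSUMPTION: the daily cost covers a full day of continuous generation."""
    return cost_per_day_usd / (tokens_per_day(tokens_per_sec) / 1e6)


if __name__ == "__main__":
    gpu_hours = implied_gpu_hours(NUM_GPUS, ASSUMED_TRAINING_DAYS)
    print(f"Active parameter share: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")
    print(f"Implied GPU-hours: {gpu_hours:,}")
    print(f"Implied cost per GPU-hour: ${implied_cost_per_gpu_hour(TRAINING_COST_USD, gpu_hours):.2f}")
    print(f"Tokens per day at {TOKENS_PER_SEC} tok/s: {tokens_per_day(TOKENS_PER_SEC):,}")
    print(f"Cost per million tokens: ${cost_per_million_tokens(COST_PER_DAY_USD, TOKENS_PER_SEC):.3f}")
```

Under these assumptions, only a small share of the model's parameters is active at a time, and the training budget works out to somewhat under $2 per GPU-hour. Both results shift if the actual training duration or usage pattern differs from the assumed values.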