DeepSeek Launches V3 AI Model, Outperforming OpenAI with 671 Billion Parameters, $5.576 Million Training Cost, and 60 Tokens/sec
Dec 28, 2024, 11:55 PM
Chinese AI startup DeepSeek has launched its latest model, DeepSeek V3, which is reported to outperform established models from tech giants such as OpenAI and Meta. The model has 671 billion total parameters and uses a mixture-of-experts (MoE) architecture, with approximately 37 billion parameters active per token. Notably, DeepSeek V3 was trained on a budget of $5.576 million, far below the hundreds of millions of dollars typically spent on models of this class. Training ran for about two months on 2,048 H800 GPUs, export-restricted chips sometimes described as 'nerfed H100s.' This development challenges the prevailing assumption that serious AI development requires substantial financial resources. The model is also notable for its open-source release and competitive pricing, generating 60 tokens per second at a cost reported to be as low as $2 per day. Its performance has drawn comparisons with OpenAI's ChatGPT, with claims that it can replicate ChatGPT's outputs in various tests. Experts are now questioning what DeepSeek's efficient approach implies for the future of AI and for the value of high-end computing hardware, particularly Nvidia chips.
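The reported budget lines up with straightforward GPU-hour arithmetic. A minimal sanity check, assuming a rental rate of about $2 per H800 GPU-hour and roughly 57 days of continuous training (both assumptions, not stated in this story):

```python
# Rough sanity check of the reported $5.576M training budget.
# Assumptions (not from this story): ~$2/GPU-hour rental rate,
# ~57 days ("two months") of continuous training on all 2,048 GPUs.
gpus = 2048
hours_per_day = 24
days = 57  # approximately two months

gpu_hours = gpus * hours_per_day * days
rate_per_gpu_hour = 2.0  # assumed USD per GPU-hour
cost = gpu_hours * rate_per_gpu_hour

print(f"{gpu_hours:,} GPU-hours -> ${cost:,.0f}")
# -> 2,801,664 GPU-hours -> $5,603,328
```

The estimate lands within about 1% of the quoted $5.576 million, suggesting the headline figure reflects compute rental cost alone rather than total R&D spend.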