Chinese AI Firm DeepSeek Unveils Open-Source V3 Model Surpassing GPT-4, Trained for $5.576M
Dec 27, 2024, 04:11 AM
Chinese AI firm DeepSeek has released DeepSeek-V3, a language model that reportedly outperforms OpenAI's GPT-4 on several benchmarks. DeepSeek-V3 is a Mixture-of-Experts (MoE) model with 671 billion total parameters, of which 37 billion are activated per token. It was trained on 14.8 trillion high-quality tokens at a reported cost of $5.576 million: 2.788 million H800 GPU hours on a cluster of 2,048 GPUs, priced at $2 per GPU hour. The model scored 88.5 on MMLU, 75.9 on MMLU-Pro, and 90.2 on MATH-500, reportedly setting new records on mathematical reasoning and coding benchmarks. DeepSeek-V3 is open source, giving researchers access to the model, research papers, and training frameworks. The company is funded by the hedge fund High-Flyer, has not sought external funding, and has built its team entirely from within China.
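The cost figures above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it uses the numbers reported in the story, derives the per-GPU-hour rate implied by dividing the total cost by the GPU hours, and estimates wall-clock time under the assumption (not stated in the story) that all 2,048 GPUs ran concurrently for the whole run. The function names are hypothetical.

```python
# Figures as reported in the story.
GPU_HOURS = 2_788_000        # H800 GPU hours
TOTAL_COST_USD = 5_576_000   # reported training cost
NUM_GPUS = 2048              # cluster size
TOTAL_PARAMS_B = 671         # total parameters, in billions
ACTIVE_PARAMS_B = 37         # parameters activated per token, in billions


def implied_rate_per_gpu_hour(total_cost_usd: float, gpu_hours: float) -> float:
    """Price per GPU hour implied by the reported cost and GPU hours."""
    return total_cost_usd / gpu_hours


def wall_clock_days(gpu_hours: float, num_gpus: int) -> float:
    """Rough duration in days, ASSUMING every GPU runs concurrently throughout."""
    return gpu_hours / num_gpus / 24


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of the MoE model's parameters used for each token."""
    return active_params / total_params


rate = implied_rate_per_gpu_hour(TOTAL_COST_USD, GPU_HOURS)
days = wall_clock_days(GPU_HOURS, NUM_GPUS)
share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)

print(f"Implied rate: ${rate:.2f} per GPU hour")
print(f"Estimated wall-clock time: {days:.1f} days")
print(f"Active parameters per token: {share:.1%}")
```

Under these assumptions the reported cost works out to $2 per GPU hour, the run would take roughly 57 days, and about 5.5% of the model's parameters are active for any given token.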