Chinese AI Firm DeepSeek Unveils Open-Source V3 Model Surpassing GPT-4, Trained for $5.576M
Dec 27, 2024, 04:11 AM
Chinese AI firm DeepSeek has released DeepSeek-V3, a language model that reportedly surpasses OpenAI's GPT-4 on various benchmarks. DeepSeek-V3 is a Mixture-of-Experts (MoE) model with 671 billion total parameters, of which 37 billion are activated per token. It was trained on 14.8 trillion high-quality tokens at a reported cost of $5.576 million, using 2.788 million H800 GPU hours on a cluster of 2,048 GPUs, priced at $2 per GPU hour. It scored 88.5 on MMLU, 75.9 on MMLU-Pro, and 90.2 on MATH-500, setting new records in mathematical reasoning and coding benchmarks. DeepSeek-V3 is open source, giving researchers access to the model, research papers, and training frameworks. The company is funded by the hedge fund High-Flyer, has not sought external funding, and has built its team entirely from within China.
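The training figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not an official calculation: the function names are hypothetical, and the wall-clock estimate assumes that all 2,048 GPUs ran concurrently for the whole run, which the story does not state.

```python
# Figures as reported in the story; everything else here is illustrative.
TOTAL_COST_USD = 5_576_000      # reported training cost
GPU_HOURS = 2_788_000           # reported H800 GPU hours
NUM_GPUS = 2048                 # reported cluster size
TOTAL_PARAMS_B = 671            # total parameters, in billions
ACTIVE_PARAMS_B = 37            # parameters activated per token, in billions


def implied_rate_per_gpu_hour(cost_usd: float, gpu_hours: float) -> float:
    """Price per GPU hour implied by the reported cost and GPU hours."""
    return cost_usd / gpu_hours


def wall_clock_days(gpu_hours: float, num_gpus: int) -> float:
    """Rough run length in days.

    Assumption: every GPU is busy for the entire run (no idle time,
    restarts, or partial-cluster phases).
    """
    return gpu_hours / num_gpus / 24


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of the MoE model's parameters used for any single token."""
    return active_params / total_params


if __name__ == "__main__":
    rate = implied_rate_per_gpu_hour(TOTAL_COST_USD, GPU_HOURS)
    days = wall_clock_days(GPU_HOURS, NUM_GPUS)
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Implied price: ${rate:.2f} per GPU hour")
    print(f"Estimated wall-clock time: {days:.1f} days")
    print(f"Active parameters per token: {share:.1%}")
```

Under these assumptions, the reported cost and GPU hours imply a price of $2 per GPU hour, the run would take roughly 57 days on the full cluster, and about 5.5% of the model's parameters are active for any given token.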