Chinese AI Firm DeepSeek Unveils Open-Source V3 Model Surpassing GPT-4, Trained for $5.576M
Dec 27, 2024, 04:11 AM
Chinese AI firm DeepSeek has released DeepSeek-V3, a new language model that reportedly surpasses OpenAI's GPT-4 on various benchmarks. DeepSeek-V3 is a Mixture-of-Experts (MoE) model with 671 billion total parameters, of which 37 billion are activated per token. It was trained on 14.8 trillion high-quality tokens at a reported cost of $5.576 million, consuming 2.788 million H800 GPU hours on a cluster of 2,048 GPUs (implying roughly $2 per GPU hour). The model scored 88.5 on MMLU, 75.9 on MMLU-Pro, and 90.2 on MATH-500, setting new records in mathematical reasoning and coding benchmarks. DeepSeek-V3 is open-sourced, giving researchers access to the model weights, research papers, and training frameworks. The company, funded by the hedge fund High-Flyer, has not sought external investment and has built its team entirely from within China.
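The reported figures are internally consistent, as a quick sanity check shows (the $2-per-GPU-hour rental rate is inferred from the totals, not stated in the article):

```python
# Sanity check of the reported DeepSeek-V3 training-cost arithmetic.
gpu_hours = 2.788e6        # H800 GPU hours reported for the full training run
rate_per_gpu_hour = 2.00   # inferred rental price in USD per H800 GPU hour
total_cost = gpu_hours * rate_per_gpu_hour

print(f"${total_cost / 1e6:.3f}M")  # → $5.576M, matching the reported cost

# The MoE activation ratio: only a fraction of parameters fire per token.
total_params = 671e9
active_params = 37e9
print(f"{active_params / total_params:.1%} of parameters active per token")
```

This is why MoE models such as DeepSeek-V3 can be far cheaper to train and serve than dense models of comparable total size: per-token compute scales with the 37B activated parameters, not the full 671B.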