Chinese AI Firm DeepSeek Unveils Open-Source V3 Model Surpassing GPT-4, Trained for $5.576M
Dec 27, 2024, 04:11 AM
Chinese AI firm DeepSeek has released DeepSeek-V3, a new language model that reportedly surpasses OpenAI's GPT-4 on various benchmarks. DeepSeek-V3 is a Mixture-of-Experts (MoE) model with 671 billion total parameters, of which 37 billion are activated per token. It was pre-trained on 14.8 trillion high-quality tokens at a reported cost of $5.576 million, using 2.788 million H800 GPU hours on a cluster of 2,048 GPUs at an assumed rental rate of $2 per GPU hour. The model scored 88.5 on MMLU, 75.9 on MMLU-Pro, and 90.2 on MATH-500, setting new records on mathematical reasoning and coding benchmarks. DeepSeek-V3 is open-sourced, giving researchers access to the model weights, research paper, and training framework. The company, self-funded by the hedge fund High-Flyer, has not sought external funding and has built its team entirely within China.
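The $5.576 million figure follows directly from the GPU-hour count and the rental rate DeepSeek assumes in its technical report ($2 per H800 GPU hour, a stated assumption rather than a market quote). A quick sanity check in Python:

```python
# Sanity check: training cost = GPU hours x assumed rental rate.
gpu_hours = 2.788e6   # total H800 GPU hours reported by DeepSeek
rate_usd = 2.0        # assumed $ per GPU hour, per the technical report
print(f"${gpu_hours * rate_usd / 1e6:.3f}M")  # -> $5.576M
```

The 37-billion-of-671-billion figure reflects how MoE layers route each token to a small subset of experts. Below is a minimal, hypothetical sketch of generic top-k expert routing in PyTorch; it illustrates the mechanism only and is not DeepSeek's actual architecture (DeepSeek-V3 uses its own DeepSeekMoE design with fine-grained and shared experts):

```python
# Generic top-k MoE layer: each token is routed to k of n_experts,
# so only a fraction of the layer's parameters is active per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts)  # router scores experts per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                            # x: (tokens, d_model)
        scores = self.gate(x)                        # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)   # keep k experts per token
        weights = F.softmax(weights, dim=-1)         # normalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):                   # only k experts run per token
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e             # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

moe = TopKMoE()
tokens = torch.randn(10, 64)
print(moe(tokens).shape)  # torch.Size([10, 64]); 2 of 8 experts fire per token
```

Because only k of n_experts run per token, per-token compute scales with the activated parameters (37B here) rather than the total (671B), which is how a model this large can stay within the stated GPU-hour budget.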
Markets
No • 50% | Yes • 50%
Resolution source: Official announcements or press releases from top 10 global tech companies

No • 50% | Yes • 50%
Resolution source: Official announcements from DeepSeek or updates on their official website

Yes • 50% | No • 50%
Resolution source: Results from independent AI benchmarking organizations or academic publications

MATH-500 • 25% | MMLU • 25% | Other • 25% | MMLU-Pro • 25%
Resolution source: Results from AI benchmark tests published by credible sources

Technology • 25% | Education • 25% | Healthcare • 25% | Finance • 25%
Resolution source: Industry reports or case studies highlighting the use of DeepSeek-V3

Other • 25% | North America • 25% | Europe • 25% | Asia-Pacific • 25%
Resolution source: Reports from market research firms or industry analysis publications