Chinese Startup DeepSeek-V3 Sets New AI Benchmarks with 671B Parameter Open-Source Model at $5.576M
Dec 29, 2024, 03:14 PM
DeepSeek, a Chinese startup, has released DeepSeek-V3, an open-source AI model that sets new benchmarks in the field. The model has 671 billion parameters but uses a Mixture-of-Experts (MoE) architecture that activates only 37 billion of them per token, improving efficiency without sacrificing performance. DeepSeek-V3 was pre-trained on 14.8 trillion tokens and generates 60 tokens per second, three times faster than its predecessor, DeepSeek-V2. It performs strongly on technical tasks such as programming and mathematical problem-solving, scoring 92% on the HumanEval coding benchmark and 90.2 on the MATH-500 challenge, and it outperforms competitors including Meta's Llama 3.1, OpenAI's GPT-4o, and Alibaba's Qwen 2.5 in coding competitions. DeepSeek-V3 is available through Hugging Face and the official website, with an enterprise API offered at promotional pricing until February 8, 2025. The total training cost was reported at $5.576 million, a notably cost-effective figure for a model of this scale.
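The sparse-activation idea behind the 671B/37B figure can be illustrated with a minimal toy sketch of top-k MoE routing: a gating network scores all experts for each token, but only the k highest-scoring experts actually run, so most parameters stay idle on any single forward pass. All dimensions and the routing rule below are illustrative assumptions, not DeepSeek-V3's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts = 8   # total experts (toy value; DeepSeek-V3 uses far more)
top_k = 2       # experts activated per token
d_model = 16    # hidden size (toy value)

# Each expert is a small feed-forward layer: d_model -> d_model.
experts = [rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
           for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts)) / np.sqrt(d_model)

def moe_forward(x):
    """Route one token vector x through its top-k experts only."""
    logits = x @ gate_w                # score every expert
    top = np.argsort(logits)[-top_k:] # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()          # softmax over the chosen experts
    # Only the selected experts' parameters are touched here; the
    # remaining experts contribute nothing to this token's compute.
    return sum(w * (x @ experts[i]) for i, w in zip(top, weights))

token = rng.standard_normal(d_model)
out = moe_forward(token)
print(out.shape)  # output keeps the model's hidden size
```

In this sketch only 2 of 8 expert weight matrices are read per token, mirroring how an MoE model's active-parameter count (37B) can be a small fraction of its total parameter count (671B).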