Chinese Startup DeepSeek-V3 Sets New AI Benchmarks with 671B Parameter Open-Source Model at $5.576M
Dec 29, 2024, 03:14 PM
DeepSeek, a Chinese startup, has released DeepSeek-V3, an open-source AI model that has set new benchmarks in the field of artificial intelligence. The model, featuring 671 billion parameters, utilizes a Mixture-of-Experts (MoE) architecture that activates only 37 billion parameters for specific tasks, enhancing efficiency and performance. DeepSeek-V3 has been pre-trained on 14.8 trillion tokens and achieves a throughput of 60 tokens per second, which is three times faster than its predecessor, DeepSeek-V2. It has demonstrated superior performance in technical tasks, such as programming and mathematical problem-solving, scoring 92% on the HumanEval programming test and 90.2 on the Math-500 challenge. The model outperforms competitors like Meta's Llama 3.1, OpenAI's GPT-4o, and Alibaba's Qwen 2.5 in coding competitions. DeepSeek-V3 is available through Hugging Face and the official website, with an API offered to enterprises at promotional pricing until February 8, 2025. The total training cost for DeepSeek-V3 was reported at $5.576 million, showcasing a cost-effective approach to AI development.
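The Mixture-of-Experts design described above — 671 billion total parameters with only 37 billion activated per token — works by routing each input to a small subset of expert sub-networks. The sketch below is a generic top-k MoE routing example, not DeepSeek's actual implementation; the dimensions, expert count, and gating details are illustrative assumptions only.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route a token vector to its top-k experts and mix their outputs."""
    logits = x @ gate_w                       # router score per expert
    topk = np.argsort(logits)[-k:]            # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()                  # softmax over selected experts only
    # Only the k chosen experts execute; the others contribute no compute.
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

# Toy setup (illustrative sizes): 4 experts, each a linear map; 2 run per token.
rng = np.random.default_rng(0)
dim, num_experts = 8, 4
gate_w = rng.normal(size=(dim, num_experts))
expert_mats = [rng.normal(size=(dim, dim)) for _ in range(num_experts)]
experts = [lambda v, w=w: v @ w for w in expert_mats]

y = moe_forward(rng.normal(size=dim), gate_w, experts, k=2)
print(y.shape)
```

In a model like DeepSeek-V3 this routing happens per layer and per token, which is how total parameter count (all experts) can be far larger than the active parameter count (only the selected experts).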