Chinese Startup DeepSeek-V3 Sets New AI Benchmarks with 671B Parameter Open-Source Model at $5.576M
Dec 29, 2024, 03:14 PM
DeepSeek, a Chinese startup, has released DeepSeek-V3, an open-source AI model that has set new benchmarks in the field of artificial intelligence. The model, featuring 671 billion parameters, uses a Mixture-of-Experts (MoE) architecture that activates only 37 billion parameters per token, improving efficiency without sacrificing performance. DeepSeek-V3 was pre-trained on 14.8 trillion tokens and achieves a throughput of 60 tokens per second, three times faster than its predecessor, DeepSeek-V2. It has demonstrated strong performance on technical tasks such as programming and mathematical problem-solving, scoring 92% on the HumanEval programming test and 90.2% on the MATH-500 benchmark, and it outperforms competitors like Meta's Llama 3.1, OpenAI's GPT-4o, and Alibaba's Qwen 2.5 in coding competitions. DeepSeek-V3 is available through Hugging Face and the official website, with an API offered to enterprises at promotional pricing until February 8, 2025. The total training cost for DeepSeek-V3 was reported at $5.576 million, showcasing a cost-effective approach to AI development.
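The efficiency claim rests on sparse activation: a router selects a small subset of experts per token, so only a fraction of the total parameters do work. The sketch below is a minimal, hypothetical illustration of top-k MoE routing in NumPy, not DeepSeek's actual implementation; the expert count, top-k value, and dimensions are toy values chosen for clarity.

```python
# Illustrative top-k Mixture-of-Experts routing (toy sketch, NOT
# DeepSeek-V3's implementation). A router scores all experts for a
# token, only the top-k experts are evaluated, and their outputs are
# combined with softmax gate weights - so most parameters stay idle.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # hypothetical expert count for this sketch
TOP_K = 2         # experts activated per token
DIM = 4           # toy hidden dimension

# Toy expert weight matrices and router weights (random for the sketch).
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((DIM, NUM_EXPERTS))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token vector x through its top-k experts only."""
    scores = x @ router                    # router logits, one per expert
    top = np.argsort(scores)[-TOP_K:]      # indices of the k best experts
    gates = np.exp(scores[top])
    gates /= gates.sum()                   # softmax over the selected experts
    # Weighted sum of the selected experts' outputs; the remaining
    # NUM_EXPERTS - TOP_K experts are never evaluated for this token.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

token = rng.standard_normal(DIM)
out = moe_forward(token)
print(out.shape)  # (4,)
```

With these toy numbers, 2 of 8 experts run per token (25% of expert parameters active); DeepSeek-V3's reported ratio is far sparser, 37B active out of 671B total.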
Markets
- Yes • 50% / No • 50% — resolution source: official announcements from DeepSeek or major tech news outlets
- Yes • 50% / No • 50% — resolution source: download statistics from the Hugging Face platform
- Yes • 50% / No • 50% — resolution source: official announcement from DeepSeek or press release on their website
- One • 25% / Two • 25% / Three • 25% / More than three • 25% — resolution source: official announcements from DeepSeek or major tech news outlets
- Top 1 • 25% / Top 3 • 25% / Top 5 • 25% / Outside Top 5 • 25% — resolution source: results from major AI competitions such as NeurIPS, CVPR, or similar
- Less than 10% • 25% / Between 10% and 30% • 25% / Between 30% and 50% • 25% / Greater than 50% • 25% — resolution source: market analysis reports from firms like Gartner or IDC