Chinese AI Startup DeepSeek Releases 671B Parameter Model V3, Outperforming Competitors with Less Compute Power Amid US Sanctions
Dec 30, 2024, 06:20 AM
Chinese AI startup DeepSeek has released DeepSeek-V3, an open-source large language model (LLM) with 671 billion parameters, built on a Mixture-of-Experts (MoE) architecture that activates only 37 billion parameters per token, trading total capacity for per-token efficiency. The model was pre-trained on 14.8 trillion tokens and generates 60 tokens per second, three times the throughput of its predecessor, DeepSeek-V2. It has demonstrated strong performance on technical tasks, including programming and mathematical problem-solving, outperforming models such as Meta's Llama 3.1, OpenAI's GPT-4o, and Alibaba's Qwen 2.5 on certain benchmarks. DeepSeek claims to have trained the model with significantly less compute than comparable models required, highlighting potential limitations of US sanctions on AI hardware exports to China. The model is available through Hugging Face and DeepSeek's official website, with an API offered at promotional pricing until February 8, 2025.
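The 37-billion figure follows from the MoE design: a router network scores a pool of expert sub-networks for each token, and only the top-scoring few are actually executed, so per-token compute scales with the activated parameters rather than the full 671 billion. Below is a minimal PyTorch sketch of top-k expert routing to illustrate the idea; the expert count, layer sizes, and top_k value are made up for illustration and are not DeepSeek-V3's real configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Toy Mixture-of-Experts layer: a learned router picks top_k experts per
    token, so only a fraction of the layer's parameters runs for any one token.
    All sizes here are illustrative, not DeepSeek-V3's actual configuration."""

    def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # one score per expert, per token
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x):                           # x: (n_tokens, d_model)
        scores = self.router(x)                     # (n_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, -1)  # keep the top_k experts per token
        weights = F.softmax(weights, dim=-1)        # renormalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = TopKMoELayer()
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64]); only 2 of 8 experts ran per token

Because only the selected experts run (and receive gradients during training), cost per token tracks the activated slice of the model rather than its total parameter count, which is the mechanism behind DeepSeek's efficiency claim.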
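On the API availability mentioned above: DeepSeek documents an OpenAI-compatible endpoint, so pointing the standard openai client at DeepSeek's base URL is one plausible way to call the hosted model. This is a sketch under that assumption; verify the base URL, model identifier, and the promotional pricing terms against DeepSeek's official documentation.

from openai import OpenAI

# Assumes DeepSeek's OpenAI-compatible endpoint; confirm base_url and model
# name against the official docs before relying on them.
client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com",
)

resp = client.chat.completions.create(
    model="deepseek-chat",  # per DeepSeek's docs, the V3-backed chat model
    messages=[{"role": "user", "content": "Write a one-line Python quicksort."}],
)
print(resp.choices[0].message.content)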