DeepSeek-V3: Chinese AI Startup Releases 671B Parameter Model, Outperforms Leading Competitors
Dec 30, 2024, 01:50 PM
DeepSeek, a Chinese AI startup, has released DeepSeek-V3, a new open-source AI model with 671 billion parameters that uses a Mixture-of-Experts (MoE) architecture to activate only 37 billion parameters per token. The model, trained on 14.8 trillion tokens, achieves a throughput of 60 tokens per second, three times faster than its predecessor, DeepSeek-V2. DeepSeek-V3 has demonstrated superior performance on technical tasks, including programming and mathematical problem-solving, outperforming leading models such as Meta's Llama 3.1, OpenAI's GPT-4o, and Alibaba's Qwen 2.5 across a range of benchmarks. The model is available through Hugging Face and the company's official website, with an API offered to enterprises at promotional pricing until February 8, 2025. DeepSeek's optimizations allowed it to train the model with roughly one-eleventh the compute of comparable efforts, suggesting limits to the effectiveness of US sanctions on AI hardware availability in China.
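The efficiency claim rests on sparse expert routing: each token is dispatched to a small subset of expert sub-networks, so most of the 671 billion parameters sit idle on any given forward pass. Below is a minimal, generic top-k MoE layer in PyTorch for illustration only; it is not DeepSeek's implementation (DeepSeek-V3 uses its own DeepSeekMoE variant with additional refinements), and the class name, expert sizes, and hyperparameters here are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy Mixture-of-Experts layer (illustrative names and sizes).

    A router scores all experts per token, only the top_k experts run,
    and their outputs are combined with the normalized router weights.
    """
    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        scores = self.router(x)                           # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)    # top_k experts per token
        weights = F.softmax(weights, dim=-1)              # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                  # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Only top_k of num_experts expert MLPs run per token: the same sparsity
# that lets a 671B-parameter model activate only ~37B parameters at a time.
layer = MoELayer(dim=64)
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

Because per-token compute scales with the number of active experts rather than the total parameter count, this design lets a model grow very large while keeping inference cost closer to that of a much smaller dense model.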