DeepSeek Launches V3 Model with 685B Parameters, 60 Tokens/Second, Outperforming Claude 3.5 Sonnet at $0.28/1M Output
Dec 26, 2024, 02:52 AM
DeepSeek has officially launched its V3 model, now available on Hugging Face. It features 685 billion total parameters and a mixture-of-experts (MoE) architecture with 256 experts, 8 of which are active per token. The model reportedly outperforms competitors such as Claude 3.5 Sonnet on various benchmarks, including a notable 60.4 score on LiveBench. V3 is designed for efficiency, generating 60 tokens per second, three times faster than its predecessor, V2. It has also shown substantial improvement in coding capability, rising from 17.8% to 48.4% on the Aider Polyglot leaderboard. Additionally, DeepSeek V3's pricing is competitive at $0.28 per million output tokens, significantly lower than that of its main competitors. The release is seen as a major advancement in the open-source AI landscape, with implications for the future of AI development and deployment.
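To illustrate the "8 active experts out of 256" idea, here is a toy sketch of top-k expert routing in an MoE layer. The gating weights, dimensions, and function names below are illustrative assumptions for demonstration only, not DeepSeek's actual implementation.

```python
import numpy as np

def moe_route(hidden, gate_w, num_active=8):
    """Toy MoE gating: pick the top-k experts per token from gate logits.

    hidden:  (tokens, hidden_dim) token representations
    gate_w:  (hidden_dim, num_experts) gating projection (hypothetical)
    Returns the chosen expert indices and their softmax mixing weights.
    """
    logits = hidden @ gate_w                              # (tokens, num_experts)
    topk = np.argsort(logits, axis=-1)[:, -num_active:]   # indices of k largest logits
    sel = np.take_along_axis(logits, topk, axis=-1)       # logits of selected experts
    # softmax over only the selected experts' logits
    weights = np.exp(sel - sel.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return topk, weights

rng = np.random.default_rng(0)
hidden = rng.standard_normal((4, 64))    # 4 tokens, toy hidden size 64
gate_w = rng.standard_normal((64, 256))  # 256 experts, as described in the article
experts, weights = moe_route(hidden, gate_w)
print(experts.shape, weights.shape)      # each token routes to 8 of 256 experts
```

Because only 8 expert subnetworks run per token, the compute per token is a small fraction of what the full 685B parameter count suggests, which is how MoE models achieve speeds like the reported 60 tokens/second.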
Markets
Resolution source: LiveBench benchmark results published on official or tech analysis websites
- No • 50%
- Yes • 50%

Resolution source: Pricing data from official websites or press releases
- Yes • 50%
- No • 50%

Resolution source: Market share reports from AI industry analysts or major tech publications
- No • 50%
- Yes • 50%

Resolution source: Speed tests published by AI research groups or tech publications
- DeepSeek V3 • 25%
- Other • 25%
- Sonnet 3.5 • 25%
- Claude 3.5 • 25%

Resolution source: Aider Polyglot leaderboard updates or similar coding capability benchmarks
- Claude 3.5 • 25%
- Other • 25%
- DeepSeek V3 • 25%
- Sonnet 3.5 • 25%

Resolution source: Official announcements from AI companies or tech news outlets
- Other • 25%
- DeepSeek • 25%
- OpenAI • 25%
- Anthropic • 25%