DeepSeek V3 AI Model With 685B Parameters Outperforms Claude 3.5 Sonnet in Aider Benchmark
Dec 25, 2024, 04:32 PM
DeepSeek has released its latest AI model, DeepSeek V3, now available through its API and on the HuggingFace platform. The 685-billion-parameter model uses a Mixture of Experts (MoE) architecture with 256 experts and a sigmoid routing method, selecting the top 8 experts for each token. DeepSeek V3 has outperformed Claude 3.5 Sonnet on the Aider benchmark for multilingual programming tasks, achieving a 48% success rate versus 17% for its predecessor, DeepSeek V2.5. The API is also reported to be roughly 2x faster than V2.5, making it comparable to Sonnet in speed. Compared to V2, DeepSeek V3 increases the vocabulary size, hidden size, intermediate size, number of hidden layers, and number of attention heads. On aider's new polyglot benchmark, it placed second with a score of 48.9%, behind o1 at 61.7% and ahead of Claude 3.5 Sonnet at 45.3%.
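The routing scheme described above (sigmoid expert scores, top 8 of 256 experts selected per token) can be sketched roughly as follows. This is an illustrative NumPy sketch of generic sigmoid top-k gating, not DeepSeek's actual implementation; all function and variable names here are hypothetical.

```python
import numpy as np

def route_tokens(hidden, w_gate, top_k=8):
    """Sketch of sigmoid top-k MoE routing (hypothetical, for illustration).

    hidden: (tokens, d_model) activations; w_gate: (d_model, n_experts).
    Returns the indices of the top_k experts per token and their
    normalized gating weights.
    """
    # Sigmoid affinity score for every expert (rather than a softmax)
    scores = 1.0 / (1.0 + np.exp(-hidden @ w_gate))    # (tokens, n_experts)
    # Select the top_k highest-scoring experts per token
    top_idx = np.argsort(scores, axis=-1)[:, -top_k:]  # (tokens, top_k)
    top_scores = np.take_along_axis(scores, top_idx, axis=-1)
    # Normalize the selected scores into gating weights that sum to 1
    gates = top_scores / top_scores.sum(axis=-1, keepdims=True)
    return top_idx, gates

rng = np.random.default_rng(0)
idx, gates = route_tokens(rng.normal(size=(4, 16)), rng.normal(size=(16, 256)))
print(idx.shape, gates.shape)  # (4, 8) (4, 8)
```

Each token's output would then be the gate-weighted sum of the outputs of its 8 selected experts; the other 248 experts are skipped, which is what keeps the active parameter count far below the 685B total.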
Markets
Yes • 50% | No • 50% — Resolution source: Press releases or official announcements from tech companies or DeepSeek
Yes • 50% | No • 50% — Resolution source: Official reports or publications from DeepSeek or independent benchmark tests
No • 50% | Yes • 50% — Resolution source: Aider's official benchmark results published on their website
DeepSeek V3 • 25% | Other existing model • 25% | A new entrant • 25% | Claude 3.5 Sonnet • 25% — Resolution source: Performance reports from AI model developers or independent benchmark tests
Other existing model • 25% | DeepSeek V3 • 25% | Claude 3.5 Sonnet • 25% | A new entrant • 25% — Resolution source: Aider's official benchmark results published on their website
Google • 25% | Microsoft • 25% | Amazon • 25% | Other • 25% — Resolution source: Official acquisition announcements or press releases