Which AI model will next outperform DeepSeek-V3 on LiveBench by end of 2025?
Claude 4.0 • 25%
GPT-5 • 25%
Anthropic's New Model • 25%
Other • 25%
Resolves based on LiveBench benchmark results published on the official website or in press releases
DeepSeek Launches Open-Source DeepSeek-V3 Model with 671 Billion Parameters, 60 Tokens/Second Speed, and $5.6 Million Training Cost
Dec 26, 2024, 02:37 PM
DeepSeek has officially released its latest AI model, DeepSeek-V3, a mixture-of-experts (MoE) model with 671 billion total parameters, of which 37 billion are activated per token. It generates 60 tokens per second, three times faster than its predecessor, DeepSeek-V2. Notably, DeepSeek-V3 scored 60.4 on the LiveBench benchmark, outperforming competitors such as Claude 3.5 Sonnet and GPT-4o across a range of tasks. The model was trained on 14.8 trillion tokens at a cost of approximately $5.6 million, far below the training costs of comparable models from Western labs. DeepSeek-V3 also shows a marked improvement in coding, jumping from a 17% success rate for its predecessor to 48% on the Aider benchmark. The model is fully open-source and available on HuggingFace, marking a notable advancement in the open-source AI landscape.
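The headline figures above can be sanity-checked with a quick back-of-envelope calculation. The sketch below derives the fraction of the network active per token and the DeepSeek-V2 speed implied by the "three times faster" claim; the implied V2 speed is an inference from the article's numbers, not a figure the article states directly.

```python
# Back-of-envelope check of the figures reported in the article above.
total_params = 671e9       # total parameters (MoE)
active_params = 37e9       # parameters activated per token
tokens_per_second = 60     # reported generation speed for V3
speedup_vs_v2 = 3          # "three times faster than DeepSeek-V2"

# Fraction of the network active for any given token (~5.5%)
active_fraction = active_params / total_params
print(f"active fraction: {active_fraction:.1%}")

# Implied DeepSeek-V2 speed, assuming the 3x claim is exact (~20 tokens/s)
v2_tokens_per_second = tokens_per_second / speedup_vs_v2
print(f"implied V2 speed: {v2_tokens_per_second:.0f} tokens/s")
```

The small active fraction is what lets an MoE model of this total size keep inference cost closer to that of a much smaller dense model.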