Loading...
Loading...
Browse all stories on DeepNewz
VisitWhich AI model will be the best at coding tasks on December 31, 2024?
OpenAI o1-preview • 25%
Anthropic Claude 3.5 Sonnet • 25%
OpenAI o1 mini • 25%
Other • 25%
LiveBench AI coding task performance metrics
OpenAI o1-Preview Takes First Place Overall on LiveBench AI, Surpasses Claude 3.5 Sonnet
Sep 13, 2024, 02:03 PM
OpenAI's new o1-preview model has claimed the top spot on LiveBench AI, surpassing Anthropic's Claude 3.5 Sonnet, which had led for 85 days. The o1-preview model excels in reasoning, mathematics, data analysis, and language, although it is not perfect in every area. Notably, OpenAI's o1 mini model is reported to be even better at reasoning than the o1-preview. Despite these advancements, Claude 3.5 Sonnet still outperforms the o1-preview in coding tasks. OpenAI o1-preview is now first place overall on LiveBench AI.
View original story
ChatGPT-4o • 25%
Gemini 1.5 Pro • 25%
Claude-3.5 • 25%
Other • 25%
OpenAI's O1 model • 25%
GPT-4 • 25%
Gemini • 25%
Anthropic's Claude • 25%
Meta's Llama 3.1-70B • 25%
OpenAI's GPT-4 • 25%
Google's Bard • 25%
Other • 25%
Yes • 50%
No • 50%
Claude 3.5 Sonnet • 33%
GPT-4o • 33%
Gemini • 34%
Llama 3.1 8B • 25%
GPT-4 • 25%
Llama 3.1 70B • 25%
Other • 25%
Llama 3.1 405B • 25%
GPT-4o • 25%
Claude Sonnet 3.5 • 25%
Other • 25%
Google's Gemini • 25%
OpenAI's GPT • 25%
Microsoft's Azure AI • 25%
Other • 25%
Claude 3.5 Sonnet • 33%
GPT-4o • 33%
Google's AI Model • 33%
Claude 3.5 Sonnet • 25%
Gemini 1.5 Pro • 25%
o1-preview • 25%
Other • 25%
Apple's 7B AI model • 25%
Mistral 7B • 25%
Llama 3 8B • 25%
Google's Gemma • 25%
Azure OpenAI • 25%
OpenAI • 25%
Meta • 25%
MistralAI • 25%
No • 50%
Yes • 50%
Yes • 50%
No • 50%
No • 50%
Yes • 50%
OpenAI o1-preview • 25%
Other • 25%
OpenAI o1 mini • 25%
Anthropic Claude 3.5 Sonnet • 25%