DeepNewz Markets

Markets Stories

Search

Loading...

Browse all stories on DeepNewz

Market

Which AI model will be the best at coding tasks on December 31, 2024?

2

OpenAI•LiveBench AI•Anthropic

Resolution / Starting Odds

OpenAI o1-preview • 25%

Anthropic Claude 3.5 Sonnet • 25%

OpenAI o1 mini • 25%

Other • 25%

LiveBench AI coding task performance metrics

Story

OpenAI o1-Preview Takes First Place Overall on LiveBench AI, Surpasses Claude 3.5 Sonnet

Sep 13, 2024, 02:03 PM

OpenAI's new o1-preview model has claimed the top spot on LiveBench AI, surpassing Anthropic's Claude 3.5 Sonnet, which had led for 85 days. The o1-preview model excels in reasoning, mathematics, data analysis, and language, although it is not perfect in every area. Notably, OpenAI's o1 mini model is reported to be even better at reasoning than the o1-preview. Despite these advancements, Claude 3.5 Sonnet still outperforms the o1-preview in coding tasks. OpenAI o1-preview is now first place overall on LiveBench AI.

View original story

Similar markets

Which AI model will have the highest Elo rating in coding by end of 2024?

ChatGPT-4o • 25%

Gemini 1.5 Pro • 25%

Claude-3.5 • 25%

Other • 25%

Which AI model will be rated as the best performer by December 31, 2024?

OpenAI's O1 model • 25%

GPT-4 • 25%

Gemini • 25%

Anthropic's Claude • 25%

Which AI model will achieve the highest performance benchmark by December 31, 2024?

Meta's Llama 3.1-70B • 25%

OpenAI's GPT-4 • 25%

Google's Bard • 25%

Other • 25%

Will AI agents based on OpenAI's o1 model outperform human developers in coding competitions by June 30, 2025?

Yes • 50%

No • 50%

Which AI model will be the best performing in 2024 benchmarks?

Claude 3.5 Sonnet • 33%

GPT-4o • 33%

Gemini • 34%

Top AI model on CodeContests by end of 2024?

Llama 3.1 8B • 25%

GPT-4 • 25%

Llama 3.1 70B • 25%

Other • 25%

Which AI model will be top-performing in benchmarks by end of 2024?

Llama 3.1 405B • 25%

GPT-4o • 25%

Claude Sonnet 3.5 • 25%

Other • 25%

Which AI model will be rated highest in performance by June 30, 2025?

Google's Gemini • 25%

OpenAI's GPT • 25%

Microsoft's Azure AI • 25%

Other • 25%

Which AI model will have the best performance in public benchmarks by end of 2024?

Claude 3.5 Sonnet • 33%

GPT-4o • 33%

Google's AI Model • 33%

Which AI model will have the highest satisfaction rating on GitHub Copilot by June 2025?

Claude 3.5 Sonnet • 25%

Gemini 1.5 Pro • 25%

o1-preview • 25%

Other • 25%

Which open-source AI model will have the highest MMLU score by December 31, 2024?

Apple's 7B AI model • 25%

Mistral 7B • 25%

Llama 3 8B • 25%

Google's Gemma • 25%

Which AI model will be the most popular in GitHub Models by December 31, 2024?

Azure OpenAI • 25%

OpenAI • 25%

Meta • 25%

MistralAI • 25%

Markets based on same story

Loading...

Looking for markets...

Show all

Will OpenAI release a new model that surpasses o1-preview in overall performance by end of 2024?

No • 50%

Yes • 50%

Will OpenAI's o1-preview model maintain the top spot on LiveBench AI by end of 2024?

Yes • 50%

No • 50%

Will OpenAI's o1-preview model surpass Claude 3.5 Sonnet in coding tasks by end of 2024?

No • 50%

Yes • 50%

Which AI model will be ranked first on LiveBench AI on December 31, 2024?

OpenAI o1-preview • 25%

Other • 25%

OpenAI o1 mini • 25%

Anthropic Claude 3.5 Sonnet • 25%