DeepNewz Markets

Market

Will OpenAI's o1-preview model maintain the top spot on LiveBench AI by end of 2024?

OpenAI•LiveBench AI•Anthropic

Resolution / Starting Odds

Yes • 50%

No • 50%

LiveBench AI rankings

Story

OpenAI o1-Preview Takes First Place Overall on LiveBench AI, Surpasses Claude 3.5 Sonnet

Sep 13, 2024, 02:03 PM

OpenAI's new o1-preview model has claimed the top spot on LiveBench AI, surpassing Anthropic's Claude 3.5 Sonnet, which had led for 85 days. The o1-preview model excels in reasoning, mathematics, data analysis, and language, although it is not perfect in every area. Notably, OpenAI's o1 mini model is reported to be even better at reasoning than the o1-preview. Despite these advancements, Claude 3.5 Sonnet still outperforms the o1-preview in coding tasks. OpenAI o1-preview is now first place overall on LiveBench AI.

View original story

Similar markets

$Will OpenAI's 'o1' model surpass GPT-4o in a public benchmark by end of 2024?$

Top 5% • 25%

Below Top 10% • 25%

Market

Story

Similar markets

Will OpenAI's 'o1' model surpass GPT-4o in a public benchmark by end of 2024?

Will OpenAI's o1 model achieve a new benchmark performance in AI research by January 31, 2025?

What will be OpenAI o1 model's performance on benchmark tasks by end of 2024?

Will OpenAI's o1-preview model receive a major update by March 31, 2025?

Will DeepSeek-R1-Lite-Preview surpass OpenAI's o1-preview in AIME benchmark by end of 2024?

Will OpenAI's O1 model outperform GPT-4 in a standardized benchmark test by December 31, 2024?

Will Google's AI surpass OpenAI's o1 model in benchmark tests by March 31, 2025?

Will OpenAI's o1 model outperform Anthropic's model on PlanBench by March 31, 2025?

Will OpenAI's ChatGPT-4o maintain the top spot on the AI Chatbot Arena leaderboard by end of 2024?

Will Google announce an AI model surpassing OpenAI's o1 model in benchmarks by June 30, 2025?

Will OpenAI's o1 model achieve significant commercial success by the end of 2025?

Will OpenAI's o1-preview model achieve a user satisfaction rating of 90% or higher by June 30, 2025?

Will OpenAI release a new model that surpasses o1-preview in overall performance by end of 2024?

Will OpenAI's o1-preview model surpass Claude 3.5 Sonnet in coding tasks by end of 2024?

Which AI model will be ranked first on LiveBench AI on December 31, 2024?

Which AI model will be the best at coding tasks on December 31, 2024?

Will OpenAI release a new model that surpasses o1-preview in overall performance by end of 2024?

Will OpenAI's o1-preview model surpass Claude 3.5 Sonnet in coding tasks by end of 2024?

Which AI model will be ranked first on LiveBench AI on December 31, 2024?

Which AI model will be the best at coding tasks on December 31, 2024?