DeepNewz Markets

Market

Will OpenAI's 'o3' model exceed 90% on ARC-AGI benchmark by end of 2025?

OpenAI•12 Days of OpenAI•Sam Altman•Mark Chen•Hongyu Ren•Frontier Math•mark•AIME•British•O2

Starting Odds

Yes • 50%

No • 50%

Resolution

Published results in AI research journals or OpenAI's official announcements

Story

OpenAI Unveils 'o3' and 'o3-mini' Models, Exceeding Human Performance on ARC-AGI

Dec 20, 2024, 05:11 PM

OpenAI announced its next-generation reasoning models, 'o3' and 'o3-mini', on the final day of its '12 Days of OpenAI' event. CEO Sam Altman, SVP of Research Mark Chen, and Research Scientist Hongyu Ren presented the models in a live stream. The 'o3' models are a major advance over the previous 'o1' model in reasoning capability. 'o3' achieved state-of-the-art results on several benchmarks, including 87.5% on the ARC-AGI benchmark, above the estimated human-level performance of 85%. The model also scored 25.2% on the Frontier Math benchmark, 96.7% on the AIME, and 71.7% on SWE-bench Verified. OpenAI skipped the name 'o2' because of a potential trademark conflict with the British telecommunications firm O2. The 'o3' models are currently available to researchers for safety testing but have not been publicly released.

Similar markets

Will OpenAI's o3 score 90%+ on ARC-AGI by end of 2025?

Yes • 50%

No • 50%

Will OpenAI's o3 model score 80%+ on ARC-AGI before release?

Yes • 50%

No • 50%

How will OpenAI's 'o3' perform on ARC-AGI benchmark compared to others by 2025?

o3 remains the top performer • 25%

Another model surpasses o3 • 25%

o3 ties with another model • 25%

No new models tested • 25%

Will OpenAI's 'o3' model surpass 30% on Frontier Math by the end of 2025?

Yes • 50%

No • 50%

Will OpenAI's o1 model achieve a new benchmark performance in AI research by January 31, 2025?

Yes • 50%

No • 50%

Will OpenAI's o1 model achieve a significant milestone towards AGI by June 30, 2025?

Yes • 50%

No • 50%

Will OpenAI's 'o3' model achieve a Codeforces rating of 2800+ by end of 2025?

Yes • 50%

No • 50%

What will be the accuracy improvement of OpenAI's o3 model over o1-preview by end of 2025?

Less than 5% • 25%

5% to 10% • 25%

10% to 20% • 25%

More than 20% • 25%

Will OpenAI's o1 models achieve a significant breakthrough in AGI research by the end of 2024?

Yes • 50%

No • 50%

Will OpenAI's 'o1' model surpass GPT-4o in a public benchmark by end of 2024?

Yes • 50%

No • 50%

Will OpenAI's 'o3' model reach a Codeforces rating of 2800 by March 2025?

Yes • 50%

No • 50%

How many AI benchmarks will OpenAI's 'o3' set records in by the end of 2025?

0-1 benchmarks • 25%

2-3 benchmarks • 25%

4-5 benchmarks • 25%

More than 5 benchmarks • 25%

Markets based on same story

Will OpenAI release 'o3' model for public use by June 30, 2025?

No • 50%

Yes • 50%

Will OpenAI resolve trademark issues and release an 'o2' model by end of 2025?

No • 50%

Yes • 50%

Which company will first adopt OpenAI's 'o3' model for commercial use by end of 2025?

Microsoft • 25%

Other • 25%

Amazon • 25%

Google • 25%

Who will release a model surpassing 'o3' on ARC-AGI by end of 2025?

Other • 25%

Google DeepMind • 25%

Anthropic • 25%

Meta AI • 25%