DeepNewz Markets

OpenAI Unveils 'o3' and 'o3-mini' Models, Exceeding Human Performance on ARC-AGI

Dec 20, 2024, 05:11 PM

OpenAI announced its next-generation reasoning models, 'o3' and 'o3-mini', on the final day of its '12 Days of OpenAI' event. CEO Sam Altman, SVP of Research Mark Chen, and Research Scientist Hongyu Ren presented the models in a live stream. OpenAI positions 'o3' as a substantial step up from the earlier 'o1' model in reasoning ability. 'o3' achieved state-of-the-art results on several benchmarks, including 87.5% on ARC-AGI, above the estimated human-level score of 85%. The model also scored 25.2% on FrontierMath, 96.7% on the AIME, and 71.7% on SWE-bench Verified. OpenAI skipped the 'o2' name because of potential trademark conflicts with the British telecommunications firm O2. The 'o3' models are currently available to researchers for safety testing but have not been publicly released.

Markets

Will OpenAI release the 'o3' model for public use by June 30, 2025?

OpenAI•12 Days of OpenAI•Sam Altman•Mark Chen•Hongyu Ren•Frontier Math•mark•AIME•British•O2

Resolution / Starting Odds

No • 50%

Yes • 50%

Official announcement on OpenAI's website or press release

Will OpenAI resolve the trademark issues and release an 'o2' model by the end of 2025?

Resolution / Starting Odds

No • 50%

Yes • 50%

Official announcement from OpenAI or trademark filings

Will OpenAI's 'o3' model exceed 90% on the ARC-AGI benchmark by the end of 2025?

Resolution / Starting Odds

Yes • 50%

No • 50%

Published results in AI research journals or OpenAI's official announcements

Which company will be the first to adopt OpenAI's 'o3' model for commercial use by the end of 2025?

Resolution / Starting Odds

Microsoft • 25%

Other • 25%

Amazon • 25%

Google • 25%

Official press releases from companies or OpenAI's announcements

Who will release a model surpassing 'o3' on ARC-AGI by the end of 2025?

Resolution / Starting Odds

Other • 25%

Google DeepMind • 25%

Anthropic • 25%

Meta AI • 25%

Published AI research benchmarks or official announcements from AI companies