Will o3-mini achieve a 2800+ Codeforces rating by the end of 2025?
Yes • 50%
No • 50%
Resolution source: Codeforces rating results published by OpenAI or the Codeforces platform
OpenAI Unveils o3 and o3-mini, Surpasses Human-Level ARC-AGI Performance, Sets New AI Benchmarks
Dec 20, 2024, 06:23 PM
OpenAI has unveiled 'o3' and 'o3-mini', its next-generation reasoning models, which use a 'private chain of thought' and self fact-checking to improve their ability to adapt to novel tasks. Announced on December 20, 2024, o3 scored 87.5% on the ARC-AGI Semi-Private Evaluation in high-compute mode, above the human performance threshold of 85%, with high-compute tasks costing $3,500 each. In low-compute mode, o3 scored 75.7% on the same evaluation. The model also set new records on other technical benchmarks: 71.7% on SWE-Bench Verified, a 2727 rating on Codeforces, 96.7% on the American Invitational Mathematics Examination (AIME), and 87.7% on GPQA-Diamond. It solved 25.2% of Epoch AI's FrontierMath problems, a significant jump from the previous best of 2%. François Chollet, a prominent AI researcher, stated that o3 represents "not merely incremental improvement, but a genuine breakthrough" in AI's ability to adapt to novel tasks. OpenAI plans to release o3-mini to the public by the end of January 2025, with o3 following shortly after. The name skips 'o2' because of potential trademark issues with the British telecommunications firm O2.