Public reception of o3-mini by March 2025?
Highly positive • 25%
Moderately positive • 25%
Neutral • 25%
Negative • 25%
Public reviews, media articles, and user feedback on platforms such as Twitter, Reddit, and tech blogs
OpenAI Unveils o3 and o3-mini, Surpasses Human-Level ARC-AGI Performance, Sets New AI Benchmarks
Dec 20, 2024, 06:23 PM
OpenAI has unveiled 'o3' and 'o3-mini', its next-generation reasoning models, which use a 'private chain of thought' and self-fact-checking to improve the models' ability to adapt to novel tasks. Announced on December 20, 2024, o3 scored 87.5% on the ARC-AGI Semi-Private Evaluation in high-compute mode, above the human performance threshold of 85%, with high-compute tasks costing $3,500 each. In low-compute mode, o3 scored 75.7% on the same evaluation. The model also set new records on other technical benchmarks, including 71.7% on SWE-Bench Verified, a 2727 rating on Codeforces, 96.7% on the American Invitational Mathematics Examination (AIME), and 87.7% on GPQA-Diamond. It achieved 25.2% on Epoch AI's FrontierMath problems, a significant jump from the previous best of 2%. François Chollet, a prominent AI researcher, stated that o3 represents "not merely incremental improvement, but a genuine breakthrough" in AI's ability to adapt to novel tasks. OpenAI plans to release o3-mini to the public by the end of January 2025, with o3 following shortly after. The naming skips 'o2' because of potential trademark issues with the British telecommunications firm O2.