What will be the primary use case for OpenAI's 'o3-mini' by July 2025?
Education • 25%
Healthcare • 25%
Finance • 25%
Other • 25%
Industry reports, OpenAI's announcements, and tech news articles
OpenAI's 'o3' Surpasses Human Performance; 'o3-mini' Launching January 2025
Dec 20, 2024, 06:42 PM
OpenAI has announced 'o3' and 'o3-mini', its next-generation reasoning models, which outperform previous AI models on several benchmarks. On the ARC-AGI benchmark, 'o3' scored 75.7% in low-compute mode and 87.5% in high-compute mode, the latter exceeding the human performance threshold of 85%. It also set records on other benchmarks: it solved 25.2% of FrontierMath problems (the previous best was 2%), scored 96.7% on the American Invitational Mathematics Examination (AIME), and reached 71.7% on SWE-bench Verified. Its Codeforces rating of 2727 places it in the top 0.05% of competitive programmers. The 'o3' models are designed to 'think' before responding via a 'private chain of thought', an approach that marks a significant leap in AI's ability to adapt to novel tasks. The company skipped the name 'o2' because of potential trademark issues with the telecommunications firm O2. The 'o3-mini' model is planned for public release by the end of January 2025, with the full 'o3' model to follow shortly after.