Which AI model will have the highest score on the MATH dataset by the end of 2025?
OpenAI's model • 25%
Deepseek-Math (7B) • 25%
Gemini 1.5 Pro (May) • 25%
GPT-4o • 25%
Official publications or announcements from AI research organizations
OpenAI AI Model Scores Over 90% on MATH Dataset, Available by 2026
Jul 15, 2024, 04:57 PM
OpenAI has reportedly tested a model internally that scored over 90% on the MATH dataset, a benchmark of competition-level math problems. The result suggests significant progress in AI capabilities and may be linked to the 'Strawberry' project. For comparison, Deepseek-Math (7B), Gemini 1.5 Pro (May), and GPT-4o scored 51.7%, 67.7%, and 76.6%, respectively. The reported score would be a substantial jump over those models, though some experts caution that it could reflect overfitting to the dataset. The model is predicted to be available by 2026 and may involve the Q* algorithm.