DeepNewz Markets

Market

Which AI model will have the highest score on the MATH dataset by the end of 2025?

OpenAI•Strawberry

Resolution / Starting Odds

OpenAI's model • 25%

Deepseek-Math (7B) • 25%

Gemini 1.5 Pro (May) • 25%

GPT-4o • 25%

Official publications or announcements from AI research organizations

Story

OpenAI AI Model Reportedly Scores Over 90% on MATH Dataset, Predicted to Be Available by 2026

Jul 15, 2024, 04:57 PM

OpenAI has reportedly tested an internal AI model that scored over 90% on the MATH dataset, a benchmark of competition-level math problems. The result suggests significant progress in AI capabilities and may be linked to the 'Strawberry' project. By comparison, Deepseek-Math (7B), Gemini 1.5 Pro (May), and GPT-4o scored 51.7%, 67.7%, and 76.6%, respectively. Some experts caution that a score this high could reflect overfitting to the benchmark. The model is predicted to be available by 2026 and may involve the Q* algorithm.

Similar markets

Which AI model will achieve the highest score in the MATH benchmark by end of 2024?

DeepSeek-R1-Lite-Preview • 25%

OpenAI's o1-preview • 25%

Google DeepMind's model • 25%

Other • 25%

Which AI model will achieve highest accuracy in mathematics tasks by end of 2025?

Phi-4 • 25%

Gemini Pro • 25%

Llama 3.3 • 25%

Other • 25%

Which AI model will be rated highest in performance by June 30, 2025?

Google's Gemini • 25%

OpenAI's GPT • 25%

Microsoft's Azure AI • 25%

Other • 25%

Which AI model will achieve the highest performance benchmark by December 31, 2024?

Meta's Llama 3.1-70B • 25%

OpenAI's GPT-4 • 25%

Google's Bard • 25%

Other • 25%

Which AI model will achieve highest score in MMLU Social Sciences benchmark by end of 2024?

Llama 3.1 405B • 25%

GPT-4o • 25%

Claude Sonnet 3.5 • 25%

Other • 25%

Which AI model will outperform Phi-4 in math benchmarks by end of 2025?

GPT-5 • 25%

Claude 4 • 25%

Llama 4 • 25%

Other • 25%

What score will OpenAI's Strawberry model achieve on the MATH dataset by end of 2024?

Less than 90% • 25%

90%-94.99% • 25%

95%-99.99% • 25%

100% • 25%

Which AI model will lead in benchmarks by end of 2025?

ChatGPT-4o • 25%

Google's Gemini • 25%

Another AI model • 25%

No clear leader • 25%

Which AI model will be rated as the best performer by December 31, 2024?

OpenAI's O1 model • 25%

GPT-4 • 25%

Gemini • 25%

Anthropic's Claude • 25%

Which AI model will be top-performing in benchmarks by end of 2024?

Llama 3.1 405B • 25%

GPT-4o • 25%

Claude Sonnet 3.5 • 25%

Other • 25%

Which AI model will have the best performance in public benchmarks by end of 2024?

Claude 3.5 Sonnet • 33%

GPT-4o • 33%

Google's AI Model • 33%

Will OpenAI's Strawberry model achieve a score of 95% or higher on the MATH dataset by end of 2024?

Yes • 50%

No • 50%

Will OpenAI's AI model scoring over 90% on MATH dataset be available by 2026?

Will OpenAI's AI model scoring over 90% on MATH dataset be integrated into a widely used educational platform by 2026?

Will OpenAI's AI model scoring over 90% on MATH dataset be used in a major international math competition by 2026?

What will be the primary algorithm used in OpenAI's new AI model scoring over 90% on the MATH dataset?
