Which AI model will outperform Phi-4 in math benchmarks by end of 2025?
GPT-5 • 25%
Claude 4 • 25%
Llama 4 • 25%
Other • 25%
Benchmark reports and academic publications
Microsoft's Phi-4 Model, With 14 Billion Parameters, Outperforms Larger AI Models in Math
Dec 13, 2024, 04:36 AM
Microsoft Corp. has unveiled Phi-4, a new small language model with 14 billion parameters developed by Microsoft Research. The model specializes in complex reasoning and math, demonstrating strong performance across benchmarks. Phi-4 outperforms larger models such as GPT-4, Claude 3.5, Llama 3.3, and Gemini Pro 1.5 on competition-math problems and math benchmarks, achieving 91.8% accuracy on AMC 10/12 math competition problems. Its development emphasizes synthetic data quality and innovative training techniques rather than increased model size, marking a shift from the traditional 'scale-first' mindset in AI. Phi-4 scored 56.1 on GPQA, 80.4 on MATH, and 82.6 on HumanEval. It is currently available on Azure AI Foundry and will soon be accessible on Hugging Face.
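Once the weights land on Hugging Face, a standard transformers workflow should be enough to try the model on a competition-style math prompt. The sketch below is illustrative only: the repo id "microsoft/phi-4" is an assumption based on Microsoft's naming of earlier Phi releases and is not confirmed in the story, and the memory/dtype settings are typical defaults for a 14B-parameter model rather than official guidance.

```python
# Minimal sketch: querying Phi-4 via Hugging Face transformers.
# Assumption: the checkpoint is published under a repo like "microsoft/phi-4";
# check the Hub listing once the release goes live.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-4"  # assumed repo id, not confirmed in the story

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 14B params -> roughly 28 GB in bf16
    device_map="auto",           # spread across available GPUs/CPU
)

# A competition-style math question, similar in flavor to AMC 10/12 problems.
messages = [
    {"role": "user",
     "content": "What is the sum of the first 50 positive odd integers?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Until the Hugging Face release is available, the same model can be reached through Azure AI Foundry, where it is already listed per the story above.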