Top AI model on CodeContests by end of 2024?
Llama 3.1 8B • 25%
GPT-4 • 25%
Llama 3.1 70B • 25%
Other • 25%
Published results from CodeContests evaluations
Meta AI Advances Coding LLMs with RLEF, Llama 3.1 Outperforms GPT-4 on CodeContests
Oct 4, 2024, 01:57 PM
Meta AI has introduced Reinforcement Learning with Execution Feedback (RLEF), a technique for coding large language models (LLMs) that integrates execution feedback at training time to improve code generation at inference time. Applied to fine-tune Llama 3.1 models, the approach lets the 8B model surpass GPT-4 on DeepMind's CodeContests, while the 70B model achieves state-of-the-art results. The method has also been validated through extensive evaluations, including on SWE-bench, demonstrating its effectiveness for code generation tasks. The evaluations were conducted on a cloud-based infrastructure that speeds them up by 30x.
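The core loop the summary describes — generate code, execute it against tests, and feed the resulting errors back to the model — can be sketched as follows. This is a minimal illustration, not Meta's implementation: the harness, the `repair_loop` helper, and the `generate` callback (a stand-in for an LLM call) are all assumptions for the example.

```python
import subprocess
import sys
import tempfile

def run_with_feedback(code: str, stdin_data: str, expected: str) -> tuple[bool, str]:
    """Execute a candidate program on one test case and return (passed, feedback)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    proc = subprocess.run(
        [sys.executable, path],
        input=stdin_data,
        capture_output=True,
        text=True,
        timeout=5,  # guard against non-terminating candidates
    )
    if proc.returncode != 0:
        return False, f"runtime error:\n{proc.stderr.strip()}"
    out = proc.stdout.strip()
    if out != expected.strip():
        return False, f"wrong answer: expected {expected!r}, got {out!r}"
    return True, "test passed"

def repair_loop(generate, tests, max_turns=3):
    """Iteratively sample code and feed execution feedback back to the model.

    `generate(feedback)` is a hypothetical model call: it takes the previous
    turn's feedback (or None on the first turn) and returns candidate code.
    """
    feedback = None
    for _ in range(max_turns):
        code = generate(feedback)
        results = [run_with_feedback(code, i, o) for i, o in tests]
        if all(ok for ok, _ in results):
            return code
        feedback = "\n".join(msg for ok, msg in results if not ok)
    return None  # no passing candidate within the turn budget
```

At inference time the model sees the concrete error messages from its own failed runs, which is what distinguishes this setup from single-shot sampling.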