Meta's Llama 3.1 Models Offer 4X Cheaper AI Deployments, Achieve 99% Performance
Aug 14, 2024, 09:16 AM
Meta's AI division has released the Llama 3.1 models, which Neural Magic's research team has quantized to 4 bits. The quantization enables deployments roughly 4X cheaper, shrinking the hardware requirement from two 8x80GB nodes to a single 4x80GB node. The quantized models (405B, 70B, and 8B) recover approximately 100% of the original performance. The HQQ Llama-3.1-70B model in particular achieves 99% of the base model's performance across various benchmarks, a significant advance in model efficiency and cost-effectiveness.
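The hardware claim follows from simple arithmetic on weight storage: going from 16-bit to 4-bit weights cuts the memory footprint by 4X, which is what lets a 405B-parameter model drop from two 8x80GB nodes to one 4x80GB node. A minimal sketch of that estimate (illustrative only; it ignores KV cache, activations, and quantization overhead such as scales and zero-points):

```python
# Back-of-envelope weight-memory estimate at different bit-widths.
# This is a rough illustration of the article's 4X claim, not a
# deployment calculator.

def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Llama 3.1 405B at 16-bit vs. 4-bit:
fp16_gb = weight_memory_gb(405, 16)  # 810 GB: exceeds one 8x80GB (640 GB) node
int4_gb = weight_memory_gb(405, 4)   # 202.5 GB: fits one 4x80GB (320 GB) node

print(f"405B @ 16-bit: {fp16_gb:.1f} GB of weights")
print(f"405B @ 4-bit:  {int4_gb:.1f} GB of weights")
```

At 16 bits the weights alone overflow a single 8x80GB node (hence two nodes), while at 4 bits they fit comfortably within a single 4x80GB node, with headroom left for activations and KV cache.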
Markets
No • 50%
Yes • 50%
Resolution source: Public announcements or press releases from major tech companies

No • 50%
Yes • 50%
Resolution source: Public financial reports or cost analysis from companies using the models

Yes • 50%
No • 50%
Resolution source: Benchmark results published by independent AI research organizations

Meta's Llama 3.1-70B • 25%
OpenAI's GPT-4 • 25%
Google's Bard • 25%
Other • 25%
Resolution source: Performance benchmarks published by independent AI research organizations

Meta's Llama 3.1 models • 25%
OpenAI's GPT-4 models • 25%
Google's Bard models • 25%
Other • 25%
Resolution source: Market analysis reports from AI industry analysts

Meta's Llama 3.1-70B • 25%
OpenAI's GPT-4 quantized • 25%
Google's Bard quantized • 25%
Other • 25%
Resolution source: Public announcements or press releases from companies adopting the models