Which feature of NVIDIA's AI Foundry will be most popular by end of 2024?
Custom generative AI models • 25%
Synthetic data generation • 25%
Fine-tuning • 25%
NeMo Retriever microservices • 25%
Usage statistics or surveys from NVIDIA or third-party market research firms
Groq Inc. and NVIDIA Turbocharge Llama 3.1 405B Model for Record-Breaking Speeds and Cost Efficiency
Jul 23, 2024, 03:18 PM
Groq Inc. has accelerated inference for the Llama 3.1 models, claiming record-breaking speeds and improved cost efficiency. The Llama 3.1 405B model hosted by Groq Inc. runs at up to 330 tokens per second, which is reported as 100 times faster than previous models. The speedup is expected to cut costs significantly, with some estimates suggesting it could be 10 times cheaper. The model is also available for download on Hugging Face. In addition, Groq Inc. has partnered with Together Inference and Fine-tuning to bring these models to a broader audience, with speeds of up to 400 tokens per second for the Llama 3.1 8B model.

NVIDIA has also announced its AI Foundry service, which will allow enterprises and nations to build custom generative AI models using Llama 3.1 405B and NVIDIA Nemotron models. The service includes synthetic data generation and fine-tuning, along with NVIDIA NeMo Retriever microservices for more accurate responses. The Llama 3.1 70B model with a 128k context window is also part of the offering.