Which AI inference tool will be rated the fastest by the end of 2024?
Cerebras • 25%
Nvidia • 25%
Groq • 25%
Other • 25%
Benchmarking reports from credible sources like MLPerf or other industry-standard benchmarks
Cerebras Launches World's Fastest AI Inference Tool, 20x Faster Than Nvidia
Aug 27, 2024, 04:04 PM
Cerebras Systems has launched an AI inference service, Cerebras Inference, aimed at challenging Nvidia's dominance of the AI computing market. The startup claims it is the fastest AI inference service in the world. The reported speeds are 1,850 tokens per second for the Llama 3.1 8B model and 446 tokens per second for the 70B model, with a figure of 450 tokens per second cited for some configurations. The service is priced at 60 cents per million tokens, which the company puts at a fifth of what hyperscalers charge, and it runs models at full 16-bit precision to preserve accuracy. Cerebras' service is reportedly 20 times faster than Nvidia GPU-based offerings and twice as fast as Groq's, positioning it as a competitive alternative for AI developers. It achieves these figures by running on Cerebras' custom wafer-scale chips.
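To put the quoted figures in practical terms, the short Python sketch below converts them into generation time and cost for a given number of output tokens. It uses only the numbers stated above, which are vendor claims rather than independent measurements. The model keys and function names are hypothetical labels chosen for illustration, and because the article does not say which model the 60-cent price applies to, the sketch applies it uniformly as an assumption.

```python
# Figures quoted in the article; vendor claims, not independent measurements.
# The dictionary keys are illustrative labels, not official model identifiers.
TOKENS_PER_SECOND = {"llama-3.1-8b": 1850, "llama-3.1-70b": 446}

# Assumption: the article does not tie this price to a specific model.
PRICE_USD_PER_MILLION_TOKENS = 0.60


def generation_seconds(model: str, tokens: int) -> float:
    """Seconds needed to emit `tokens` output tokens at the quoted rate."""
    return tokens / TOKENS_PER_SECOND[model]


def cost_usd(tokens: int) -> float:
    """Cost in US dollars of `tokens` tokens at the quoted price."""
    return tokens / 1_000_000 * PRICE_USD_PER_MILLION_TOKENS


if __name__ == "__main__":
    for model in TOKENS_PER_SECOND:
        secs = generation_seconds(model, 10_000)
        print(f"{model}: {secs:.1f} s for 10,000 tokens, ${cost_usd(10_000):.4f}")
```

At these rates a 10,000-token response would take a few seconds on the 8B model and a few tens of seconds on the 70B model, and would cost well under one cent at the quoted price.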