Which AI model will have the shortest time-to-first-token by June 30, 2025?
Cerebras Llama 3.1 • 25%
OpenAI GPT-4o • 25%
Anthropic Claude 3.5 • 25%
Other • 25%
Resolution source: performance metrics published by credible tech research firms or industry reports
Cerebras Achieves 969 Tokens/Second with Meta's Llama 3.1 405B, 12x Faster than GPT-4o, 240ms Time-to-First-Token
Nov 19, 2024, 06:44 AM
Cerebras Systems has deployed Meta's Llama 3.1 405B model at an output speed of 969 tokens per second, making it 12 times faster than OpenAI's GPT-4o and 18 times faster than Anthropic's Claude 3.5 Sonnet. The deployment has a time-to-first-token of 240 milliseconds and supports a context length of 128,000 tokens with 16-bit weights. The performance is attributed to Cerebras's Wafer Scale Engine (WSE), a processor used for AI and scientific computing workloads. The company plans to introduce a public inference endpoint soon, which would widen access to the deployment.
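To put the quoted figures in perspective, the short Python sketch below works through the arithmetic. It rests on two assumptions that the story does not state: that "12 times faster" and "18 times faster" are ratios of output tokens per second, and that a response takes roughly the time-to-first-token plus the token count divided by a constant output rate. The function names are illustrative only, and the 1,000-token response length is a hypothetical example.

```python
# Figures quoted in the story for Cerebras running Llama 3.1 405B.
CEREBRAS_TOKENS_PER_SEC = 969.0
CEREBRAS_TTFT_SEC = 0.240  # 240 milliseconds


def implied_baseline_throughput(fast_tokens_per_sec: float, speedup: float) -> float:
    """Output rate of a system that is `speedup` times slower (assumed throughput ratio)."""
    return fast_tokens_per_sec / speedup


def response_time(num_tokens: int, ttft_sec: float, tokens_per_sec: float) -> float:
    """Simplified estimate: wait for the first token, then stream at a constant rate."""
    return ttft_sec + num_tokens / tokens_per_sec


# Implied output rates for the two comparison models, under the ratio assumption.
gpt4o_tps = implied_baseline_throughput(CEREBRAS_TOKENS_PER_SEC, 12)
claude_tps = implied_baseline_throughput(CEREBRAS_TOKENS_PER_SEC, 18)

print(f"Implied GPT-4o rate: {gpt4o_tps:.2f} tokens/s")              # 80.75
print(f"Implied Claude 3.5 Sonnet rate: {claude_tps:.2f} tokens/s")  # 53.83

# Hypothetical 1,000-token response on the Cerebras deployment.
estimate = response_time(1000, CEREBRAS_TTFT_SEC, CEREBRAS_TOKENS_PER_SEC)
print(f"Estimated time for 1,000 tokens: {estimate:.2f} s")          # 1.27
```

Under these assumptions, the comparison models would be producing roughly 81 and 54 tokens per second. The story gives no time-to-first-token for either of them, so the sketch does not estimate their full response times.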