DeepNewz Markets

Markets Stories

Search

Loading...

Browse all stories on DeepNewz

Market

Which AI model will have the shortest time-to-first-token by June 30, 2025?

2

Cerebras Systems•Meta•Llama•OpenAI•GPT•Anthropic•Cerebras•Wafer Scale Engine•WSE

Resolution / Starting Odds

Cerebras Llama 3.1 • 25%

OpenAI GPT-4o • 25%

Anthropic Claude 3.5 • 25%

Other • 25%

Performance metrics published by credible tech research firms or industry reports

Story

Cerebras Achieves 969 Tokens/Second with Meta's Llama 3.1 405B, 12x Faster than GPT-4o, 240ms Time-to-First-Token

Nov 19, 2024, 06:44 AM

Cerebras Systems has achieved a significant milestone in AI performance with its deployment of Meta's Llama 3.1 405B model. The system operates at an impressive output of 969 tokens per second, making it 12 times faster than OpenAI's GPT-4o and 18 times faster than Anthropic's Claude 3.5 Sonnet. Additionally, it boasts a time-to-first-token of just 240 milliseconds and supports a context length of 128,000 tokens with 16-bit weights. This performance enhancement is attributed to Cerebras's advanced Wafer Scale Engine (WSE), which has redefined capabilities in AI and scientific computing. The company plans to introduce a public inference endpoint soon, further expanding access to this high-speed AI technology.

View original story

Similar markets

Which AI model will be rated highest in performance by June 30, 2025?

Google's Gemini • 25%

OpenAI's GPT • 25%

Microsoft's Azure AI • 25%

Other • 25%

Which AI model will have the highest market adoption by June 30, 2025?

OpenAI's O1 model • 25%

GPT-4 • 25%

Gemini • 25%

Anthropic's Claude • 25%

Which AI model will achieve the highest performance benchmark by December 31, 2024?

Meta's Llama 3.1-70B • 25%

OpenAI's GPT-4 • 25%

Google's Bard • 25%

Other • 25%

First company to release a new AI language model after August 2024?

OpenAI • 25%

Google • 25%

Microsoft • 25%

Other • 25%

Which AI model will be rated as the best performer by December 31, 2024?

OpenAI's O1 model • 25%

GPT-4 • 25%

Gemini • 25%

Anthropic's Claude • 25%

Which quantized AI model will see the highest adoption by December 31, 2024?

Meta's Llama 3.1-70B • 25%

OpenAI's GPT-4 quantized • 25%

Google's Bard quantized • 25%

Other • 25%

Which AI model will be the leading NLP model in 2025?

Google's Gemini AI • 25%

OpenAI's GPT-4 • 25%

Anthropic's Claude-3 • 25%

Other • 25%

Which benchmark will LiquidAI's models achieve SOTA performance in by June 30, 2024?

MMLU • 25%

ARC • 25%

GSM8K • 25%

None by June 30, 2024 • 25%

Which model will achieve the highest vision-language performance by December 31, 2024?

GPT-4o • 25%

InternVL 2 • 25%

NVLM 1.0 • 25%

Other • 25%

Which AI model will be most used in commercial applications by March 31, 2025?

Meta (Llama 3) • 25%

OpenAI (GPT-4o) • 25%

Anthropic (Claude 3.5 Sonnet) • 25%

Other • 25%

Which speech recognition model will rank highest in performance benchmarks by end of 2024?

aiOla's Whisper-Medusa • 25%

OpenAI's Whisper • 25%

Google's Speech-to-Text • 25%

Other • 25%

Top AI model in community votes by mid-2025?

ChatGPT-4o • 25%

Google's Gemini • 25%

Another AI model • 25%

No clear winner • 25%

Markets based on same story

Loading...

Looking for markets...

Show all

Will another company surpass Cerebras' AI performance with Llama 3.1 by June 30, 2025?

No • 50%

Yes • 50%

Will Cerebras announce an AI model with context length over 128,000 tokens by September 30, 2025?

No • 50%

Yes • 50%

Will Cerebras launch Llama 3.1 inference endpoint by March 31, 2025?

Yes • 50%

No • 50%

Which company will announce the largest AI model by parameters by December 31, 2025?

Other • 25%

OpenAI • 25%

Cerebras Systems • 25%

Meta • 25%