Will Cerebras announce an AI model with context length over 128,000 tokens by September 30, 2025?
Yes • 50%
No • 50%
Resolution source: official announcements from Cerebras Systems or credible tech news outlets
Cerebras Achieves 969 Tokens/Second with Meta's Llama 3.1 405B, 12x Faster than GPT-4o, 240ms Time-to-First-Token
Nov 19, 2024, 06:44 AM
Cerebras Systems has achieved a significant milestone in AI inference performance with its deployment of Meta's Llama 3.1 405B model. The system generates 969 output tokens per second, 12 times faster than OpenAI's GPT-4o and 18 times faster than Anthropic's Claude 3.5 Sonnet, with a time-to-first-token of just 240 milliseconds and support for a 128,000-token context length at 16-bit weights. Cerebras attributes the performance to its Wafer Scale Engine (WSE), which the company says has redefined capabilities in AI and scientific computing. Cerebras plans to introduce a public inference endpoint soon, further expanding access to this high-speed AI technology.
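The headline numbers combine into a simple latency model: the wall-clock time to stream a response is the time-to-first-token plus the remaining tokens divided by steady-state throughput. A minimal sketch using the figures reported in the story (240 ms TTFT, 969 tokens/s); the function name and the example response length are illustrative, not from the announcement:

```python
# Rough streaming-latency model from the reported Cerebras figures.
TTFT_S = 0.240          # time to first token, in seconds (240 ms)
THROUGHPUT_TPS = 969.0  # steady-state output tokens per second

def response_latency_s(output_tokens: int) -> float:
    """Estimated wall-clock seconds to stream a full response:
    startup latency plus generation time at steady throughput."""
    return TTFT_S + output_tokens / THROUGHPUT_TPS

# A hypothetical 1,000-token answer would take roughly 1.27 seconds:
print(round(response_latency_s(1000), 2))
```

By the same model, emitting an entire 128,000-token context at this rate would take a little over two minutes, which illustrates why throughput matters more than TTFT for long outputs.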
Related market options (25% each): Anthropic Claude 3.5 · Cerebras Llama 3.1 · OpenAI GPT-4o · Other
Related market options (25% each): Cerebras Systems · Meta · OpenAI · Other