Groq Inc. Achieves 40,792 Tokens/s on Llama3 70B Model in AI Breakthrough
Jun 6, 2024, 03:57 PM
Groq Inc. has made significant advances in AI language modeling, particularly with the Llama3 models. The company has achieved an input rate of 40,792 tokens per second on the Llama3 70B model using FP16 multiplies with FP32 accumulation, following its earlier milestone of 30,000 tokens per second on the Llama3 8B model. The improvements are attributed to Groq's approach of replacing MatMul operations with addition and negation, which has maintained strong performance at billion-parameter scales while reducing memory usage by up to 61%. Groq's technology also demonstrates impressive inference speed and precision, processing approximately 8,000 tokens in 0.2 seconds with lossless precision, roughly 40,000 tokens per second and consistent with the headline figure. In addition, Groq has achieved over 1,200 tokens per second on Llama3 8B while operating at 13 W, moving LLMs closer to brain-like efficiency.
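For context, the contrast between the two arithmetic regimes mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not Groq's implementation: it assumes ternary ({-1, 0, +1}) weights as one common way a layer can be computed with only additions and negations, and the function names and shapes here are hypothetical.

```python
import numpy as np

# Illustrative sketch only (not Groq's kernels): contrast a conventional FP16-multiply /
# FP32-accumulate linear layer with a "MatMul-free" layer whose weights are restricted
# to {-1, 0, +1}, so outputs are formed purely by additions and negations of activations.

def fp16_linear(x_fp16: np.ndarray, w_fp16: np.ndarray) -> np.ndarray:
    # FP16 operands, accumulated in FP32 (the regime cited for the 40,792 tokens/s run).
    return x_fp16.astype(np.float32) @ w_fp16.astype(np.float32)

def ternary_linear(x: np.ndarray, w_ternary: np.ndarray) -> np.ndarray:
    # w_ternary entries are in {-1, 0, +1}: add activations where the weight is +1,
    # subtract (negate) where it is -1, and skip zeros -- no multiplications needed.
    out = np.zeros(w_ternary.shape[1], dtype=np.float32)
    for j in range(w_ternary.shape[1]):
        col = w_ternary[:, j]
        out[j] = x[col == 1].sum() - x[col == -1].sum()
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal(8).astype(np.float32)
w_dense = rng.standard_normal((8, 4)).astype(np.float16)
w_tern = rng.choice([-1, 0, 1], size=(8, 4)).astype(np.int8)

print(fp16_linear(x.astype(np.float16), w_dense))  # multiply-accumulate path
print(ternary_linear(x, w_tern))                   # addition/negation-only path
```

In such a scheme the expensive multiply-accumulate becomes masked additions and subtractions, which is the property the article credits for the reduced memory footprint and high throughput.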
Related prediction markets on this story (options shown with current odds):

Gaming company • 25%
Financial institution • 25%
Tech company • 25%
Other • 25%

No Major Partnership • 25%
Partnership with Google • 25%
Partnership with Apple • 25%
Partnership with Amazon • 25%

Above 20% • 25%
5%-10% • 25%
10%-15% • 25%
15%-20% • 25%

Collaboration on AI with top 3 universities • 25%
New AI hardware release • 25%
70,000 tokens/s on Llama3 model • 25%
70% reduction in memory usage • 25%