Groq Inc. Achieves Breakthrough with 30,210 Tokens/Sec and MLPerf Benchmark
May 25, 2024, 01:06 AM
Groq Inc. has reached a performance-engineering milestone: its engineers, who have been working to improve the company's stack, achieved a throughput of 30,210 tokens per second. This places Groq ahead of top GPUs in data processing speed. Combining Groq with semantic caching can also substantially speed up the generation of answers to user queries. Separately, new MLPerf benchmark results highlight the efficiency of 8 x H100 GPUs attached to a single VM, which use only a small fraction of the physical host's CPU and memory and so leave room for more applications to run simultaneously on the same system. Groq's LPUs have also shown impressive performance, and mixture-of-experts (MoE) models are now twice as fast in the latest MLX, running at 60 tokens per second on an M2 Ultra.
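To illustrate the semantic caching idea mentioned above, here is a minimal Python sketch. It is not Groq's implementation or any particular library's API. The `embed` function is a toy bag-of-words stand-in for a real embedding model, and `SemanticCache`, `fake_model`, and the 0.8 threshold are hypothetical names and values chosen only for this example. The point is that a query similar enough to an earlier one is answered from the cache, so the model is not called again.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase bag-of-words counts.
    A real system would use a learned embedding model instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical wrapper that reuses answers for similar prompts."""

    def __init__(self, generate, threshold=0.8):
        self.generate = generate    # callable: prompt -> answer (the slow model call)
        self.threshold = threshold  # minimum similarity to count as a cache hit
        self.entries = []           # list of (embedding, answer) pairs
        self.hits = 0
        self.misses = 0

    def ask(self, prompt):
        query = embed(prompt)
        best_score, best_answer = 0.0, None
        # Linear scan for the most similar cached prompt.
        for vec, answer in self.entries:
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        if best_answer is not None and best_score >= self.threshold:
            self.hits += 1
            return best_answer
        # Cache miss: call the model and remember the result.
        self.misses += 1
        answer = self.generate(prompt)
        self.entries.append((query, answer))
        return answer


def fake_model(prompt):
    """Placeholder for a real inference call (assumption for this sketch)."""
    return f"answer to: {prompt}"


if __name__ == "__main__":
    cache = SemanticCache(fake_model)
    print(cache.ask("What is the capital of France?"))
    print(cache.ask("what is the capital of france"))  # served from the cache
    print(cache.hits, cache.misses)
```

The linear scan keeps the sketch short. A production cache would more likely store embeddings in a vector index and tune the similarity threshold against real traffic.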