Researchers Develop Scalable MatMul-Free Language Model with 61% Memory Reduction
Jun 6, 2024, 12:12 PM
Researchers have developed a scalable, MatMul-free language model that eliminates matrix multiplication operations while maintaining strong performance at billion-parameter scales. The approach, which replaces MatMul operations with addition and negation, has been shown to reduce memory usage by up to 61% and to improve GPU efficiency. The model processes billion-parameter-scale models at 13 W while exceeding human-readable throughput, moving large language models (LLMs) closer to brain-like efficiency. The implementation was a collaborative effort involving researchers W. Guo, J. Long, Y. Zeng, and Z. Liu from Princeton University, the Stevens Institute of Technology, and the University of Pennsylvania.
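The core idea of replacing MatMul with addition and negation can be illustrated with a small sketch. This is not the authors' implementation: it merely assumes weights constrained to the ternary set {-1, 0, +1}, under which a dense matrix product collapses into sums and sign flips, with no multiplications.

```python
import numpy as np

def ternary_matmul_free(x, w_ternary):
    """Hypothetical sketch: if every weight is -1, 0, or +1, then
    y = x @ W needs no multiplications. Each output column is just
    the sum of inputs where the weight is +1, minus the sum of
    inputs where the weight is -1 (zeros are skipped entirely)."""
    n_out = w_ternary.shape[1]
    out = np.zeros((x.shape[0], n_out))
    for j in range(n_out):
        col = w_ternary[:, j]
        # Addition for +1 weights, negation (subtraction) for -1 weights.
        out[:, j] = x[:, col == 1].sum(axis=1) - x[:, col == -1].sum(axis=1)
    return out
```

Because each weight selects only "add", "skip", or "negate", the same trick is what lets hardware implementations avoid multiplier circuits and cut memory traffic, since ternary weights can be stored in about 1.6 bits each rather than 16 or 32.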
Markets

- Yes • 50% | No • 50%
  Resolution source: Public announcements by major tech companies or press releases
- No • 50% | Yes • 50%
  Resolution source: Major scientific journals or conference proceedings
- No • 50% | Yes • 50%
  Resolution source: Industry reports or surveys conducted by AI research organizations
- Memory reduction • 33% | Performance at scale • 33% | GPU efficiency • 33%
  Resolution source: Public statements or press releases by companies adopting the model
- University of Pennsylvania • 33% | Stevens Institute of Technology • 33% | Princeton University • 33%
  Resolution source: Databases of academic publications such as Google Scholar or arXiv
- Amazon • 25% | Google • 25% | Microsoft • 25% | Meta • 25%
  Resolution source: Public announcements by the respective tech companies