NVIDIA Unveils Normalized Transformer (nGPT) for 4-20x Faster Training and Improved Stability of Large Language Models
Oct 19, 2024, 02:20 PM
NVIDIA researchers have introduced the Normalized Transformer (nGPT), an architecture that constrains its representations to a hypersphere to improve the training speed and stability of large language models (LLMs). The researchers report that nGPT trains 4 to 20 times faster than comparable prior models. Experts in the field note that, if these results hold, nGPT could significantly accelerate convergence in transformer models, making it a notable development in the ongoing evolution of AI and machine learning technologies.
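The core idea reported for nGPT is to keep embeddings and hidden states L2-normalized so that all representations lie on the unit hypersphere, with each layer nudging the state a small, learned step along the sphere rather than adding an unconstrained residual. The PyTorch sketch below is a minimal illustration of that idea under stated assumptions; the class name `NormalizedResidualBlock`, the per-dimension step parameter `alpha`, and its 0.05 initialization are hypothetical choices for this sketch, not the paper's exact implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def unit_norm(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Project vectors onto the unit hypersphere (L2 normalization)."""
    return F.normalize(x, p=2, dim=dim)

class NormalizedResidualBlock(nn.Module):
    """Illustrative sketch of a hypersphere-constrained residual update.

    The hidden state is kept on the unit sphere; the wrapped sub-layer
    proposes a (normalized) new point, and a learned per-dimension step
    size `alpha` controls how far the state moves toward it before
    re-normalization. All names and initial values here are assumptions.
    """
    def __init__(self, d_model: int, sublayer: nn.Module):
        super().__init__()
        self.sublayer = sublayer  # e.g. an attention or MLP block
        self.alpha = nn.Parameter(torch.full((d_model,), 0.05))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        h = unit_norm(h)                        # state lives on the sphere
        proposal = unit_norm(self.sublayer(h))  # sub-layer output, normalized
        # Take a small step from h toward the proposal, then retract back
        # onto the sphere (a first-order stand-in for spherical interpolation).
        return unit_norm(h + self.alpha * (proposal - h))

# Usage: wrap any sub-layer, e.g. a toy linear map over 64-dimensional states.
block = NormalizedResidualBlock(64, nn.Linear(64, 64))
out = block(torch.randn(2, 10, 64))  # (batch, seq, d_model); rows are unit-norm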
Markets
No • 50% | Yes • 50% — resolution source: research papers and public project reports from AI research organizations
No • 50% | Yes • 50% — resolution source: press releases and official announcements from tech companies
Yes • 50% | No • 50% — resolution source: industry reports and announcements from major AI companies
Training Speed • 25% | Stability • 25% | Energy Efficiency • 25% | Other • 25% — resolution source: industry analysis reports and AI research papers
NeurIPS 2024 • 25% | ICML 2024 • 25% | CVPR 2024 • 25% | Other • 25% — resolution source: conference agendas and keynote announcements
Google • 25% | Microsoft • 25% | Meta • 25% | Other • 25% — resolution source: official announcements and press releases from tech companies