Will AdEMAMix be featured in major AI conferences by the end of 2024?
Yes, at NeurIPS 2024 • 25%
Yes, at ICML 2024 • 25%
Yes, at both NeurIPS and ICML • 25%
No, it will not be featured in any major AI conferences • 25%
Conference agendas and presentations from major AI conferences like NeurIPS, ICML, etc.
Apple and EPFL Introduce AdEMAMix, a Novel AI Optimizer with 1.95x Improvement
Sep 10, 2024, 01:11 AM
Researchers from Apple and EPFL have introduced AdEMAMix, an optimizer that modifies Adam by mixing two exponential moving averages (EMAs) of the gradient, so that training of large models can draw on both recent and much older gradients. The optimizer, which can reportedly be implemented in about 120 lines of code, is claimed to be 1.95x more token-efficient than the widely used AdamW optimizer: in the paper's headline result, a 1.3B-parameter language model trained with AdEMAMix on 101B tokens performs comparably to an AdamW model trained on 197B tokens. Put differently, AdamW needs about 95% more training tokens to reach the same level of performance, which means AdEMAMix needs roughly 49% fewer, not 95% fewer. The approach uses two EMAs in the numerator of the Adam update, a fast one with a low beta and a slow one with a high beta, which may help explain its strong performance in various optimization scenarios, including FedOpt variants like DiLoCo.
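To make the "two moving averages in the numerator" idea concrete, here is a minimal pure-Python sketch of a single AdEMAMix-style update step. It is an illustration under stated assumptions, not the authors' reference code: the names `init_state` and `ademamix_step` are hypothetical, the hyperparameter defaults are illustrative, and the warmup schedules the paper applies to `alpha` and `beta3` are omitted for brevity.

```python
import math


def init_state(n):
    """Create zeroed optimizer state for n parameters (hypothetical helper)."""
    return {"t": 0, "m_fast": [0.0] * n, "m_slow": [0.0] * n, "v": [0.0] * n}


def ademamix_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
                  beta3=0.9999, alpha=5.0, eps=1e-8, weight_decay=0.0):
    """One AdEMAMix-style step on lists of floats.

    Assumption: constant alpha and beta3 (the paper warms both up over time).
    """
    state["t"] += 1
    t = state["t"]
    new_theta = []
    for i, (p, g) in enumerate(zip(theta, grad)):
        # Fast EMA (low beta): tracks recent gradients, as in Adam.
        state["m_fast"][i] = beta1 * state["m_fast"][i] + (1 - beta1) * g
        # Slow EMA (high beta): remembers much older gradients.
        state["m_slow"][i] = beta3 * state["m_slow"][i] + (1 - beta3) * g
        # Second-moment estimate, as in Adam.
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g

        # Bias-correct the fast EMA and the second moment; the slow EMA
        # is used as-is in this sketch.
        m_fast_hat = state["m_fast"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)

        # Numerator mixes both EMAs; alpha weights the slow one.
        update = (m_fast_hat + alpha * state["m_slow"][i]) / (math.sqrt(v_hat) + eps)
        # Decoupled weight decay, in the style of AdamW.
        new_theta.append(p - lr * (update + weight_decay * p))
    return new_theta


# Usage: a few steps on f(x) = x^2, whose gradient is 2x.
x = [1.0]
state = init_state(1)
for _ in range(50):
    x = ademamix_step(x, [2.0 * x[0]], state, lr=0.01)
print(x[0])
```

Setting `alpha=0.0` removes the slow average and reduces the step to an AdamW-style update, which makes the sketch convenient for side-by-side comparisons.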
Yes, in Apple's products • 25%
Yes, in other companies' products • 25%
Yes, in both Apple's and other companies' products • 25%
No, it will not be implemented in any commercial products • 25%