Nvidia Unveils Open-Source NVLM 1.0, Achieves State-of-the-Art Vision-Language Results
Sep 18, 2024, 02:33 AM
Nvidia has introduced NVLM 1.0, a new family of open-source frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks. These models rival leading proprietary models such as GPT-4o and open-access models like InternVL 2. NVLM 1.0 also shows improved performance on text-only tasks following multimodal training. The code and model weights for NVLM 1.0 are now open-source. The technical report accompanying the release provides insights into architectural design, data curation, and other aspects of building cutting-edge multimodal LLMs.