Nvidia Unveils Open-Source NVLM 1.0, Achieves State-of-the-Art Vision-Language Results
Sep 18, 2024, 02:33 AM
Nvidia has introduced NVLM 1.0, a new family of open-source frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks. These models rival leading proprietary models such as GPT-4o and open-access models like InternVL 2. NVLM 1.0 also shows improved performance on text-only tasks following multimodal training. The code and model weights for NVLM 1.0 are now open-source. The technical report accompanying the release provides insights into architectural design, data curation, and other aspects of building cutting-edge multimodal LLMs.