Will Pixtral-12B achieve a top-3 ranking on the ImageNet leaderboard by the end of 2024?
Yes • 50%
No • 50%
ImageNet leaderboard
Mistral Releases 25.38 GB Pixtral-12B with 400M-Parameter Vision Encoder
Sep 11, 2024, 07:13 AM
Mistral has released a new multimodal model named Pixtral-12B. The model is a vision-language model (VLM) featuring a 400-million-parameter vision encoder. It is based on Mistral-Nemo and incorporates elements from SigLIP and the ViT vision encoder. Pixtral-12B has a download size of 25.38 GB, with a model dimension of 5120, 40 layers, a head dimension of 128, a hidden dimension of 14336, 32 attention heads, 8 KV heads, a RoPE theta of 1e9, a norm epsilon of 1e-05, and a vocabulary size of 131072. The model is expected to perform well compared to other models such as Llama 3.1. A magnet link for downloading the model has also been provided.
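The specifications above can be collected into a configuration sketch. This is a minimal illustration only: the field names are assumptions styled after the `params.json` format Mistral ships with its open-weight models, not a confirmed file from the Pixtral-12B release; the values are the ones reported in the story.

```python
# Reported Pixtral-12B configuration (field names are assumed, styled after
# the params.json format used by Mistral's open-weight releases).
PIXTRAL_12B_PARAMS = {
    "dim": 5120,                     # model (embedding) dimension
    "n_layers": 40,                  # transformer layers
    "head_dim": 128,                 # dimension per attention head
    "hidden_dim": 14336,             # feed-forward hidden dimension
    "n_heads": 32,                   # attention (query) heads
    "n_kv_heads": 8,                 # key/value heads (grouped-query attention)
    "rope_theta": 1_000_000_000.0,   # RoPE base frequency (1e9)
    "norm_eps": 1e-05,               # normalization epsilon
    "vocab_size": 131072,            # tokenizer vocabulary size
}

# Having fewer KV heads than query heads (8 vs. 32) indicates grouped-query
# attention: each KV head is shared by a group of query heads.
gqa_group_size = PIXTRAL_12B_PARAMS["n_heads"] // PIXTRAL_12B_PARAMS["n_kv_heads"]
print(gqa_group_size)  # 4 query heads per KV head
```

The unusually large RoPE theta (1e9, versus the 1e4 of the original RoPE formulation) is consistent with models tuned for long context windows.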