Will Pixtral 12B surpass 1 million downloads on GitHub by March 31, 2025?
Yes • 50%
No • 50%
Resolution source: download statistics from GitHub
Mistral AI Releases 25.38 GB Pixtral 12B, Its First 12-Billion Parameter Multimodal Model
Sep 11, 2024, 08:15 AM
Mistral AI has released its first multimodal model, Pixtral 12B, which integrates language and vision processing in a single model. The download is approximately 25.38 GB, and the architecture has 12 billion parameters across 40 layers with a hidden dimension of 14,336. Key specifications include a text backbone based on Mistral Nemo 12B, a vision adapter with 400 million parameters, and an enlarged vocabulary of 131,072 tokens. The vision encoder uses GeLU activations and 2D rotary position embeddings (RoPE), and the model introduces three new special tokens. Pixtral 12B is available via torrent and has been uploaded to platforms including GitHub and Hugging Face. This release marks a significant step forward for Mistral AI in multimodal AI technology.
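As a rough sanity check, the reported 25.38 GB file size is consistent with the stated parameter counts if the weights are stored in a 16-bit format (2 bytes per parameter). A minimal sketch, assuming bf16 weights and the article's parameter figures:

```python
# Rough on-disk size estimate for Pixtral 12B, assuming bf16 storage
# (2 bytes per parameter). Parameter counts are taken from the article;
# the 2-byte assumption is illustrative, not confirmed by the release.
text_backbone_params = 12e9    # Mistral Nemo 12B text backbone
vision_adapter_params = 0.4e9  # 400M-parameter vision adapter
bytes_per_param = 2            # bf16 / fp16

size_gb = (text_backbone_params + vision_adapter_params) * bytes_per_param / 1e9
print(f"~{size_gb:.1f} GB")  # ~24.8 GB, close to the reported 25.38 GB
```

The small remainder would be accounted for by embedding tables for the 131,072-token vocabulary, extra tensors, and file-format overhead.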