Mistral AI Releases 25.38 GB Pixtral 12B, Its First 12-Billion Parameter Multimodal Model
Sep 11, 2024, 08:15 AM
Mistral AI has released its first multimodal model, Pixtral 12B, which combines language and vision processing. The model is approximately 25.38 GB in size and uses a 12-billion-parameter architecture with 40 layers and a feed-forward hidden dimension of 14,336. Key specifications include a text backbone based on Mistral Nemo 12B, a vision adapter with 400 million parameters, and an expanded vocabulary of 131,072 tokens. The vision encoder uses GeLU activations and two-dimensional rotary position embeddings (2D RoPE), and the model introduces three new special tokens. Pixtral 12B was initially distributed via torrent and has since been uploaded to platforms such as GitHub and Hugging Face. The release marks Mistral AI's first entry into multimodal models.
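For context, the reported file size is roughly consistent with the parameter counts above under a common storage assumption. The following back-of-the-envelope check is not part of the announcement; it assumes bfloat16 weights at 2 bytes per parameter, since the release does not state the checkpoint's dtype.

```python
# Back-of-the-envelope check: does a 25.38 GB checkpoint match the
# reported parameter counts? Assumption (not stated in the release):
# weights are stored in bfloat16, i.e. 2 bytes per parameter.
file_size_bytes = 25.38e9          # reported size, decimal gigabytes
bytes_per_param = 2                # bfloat16 assumption

implied_params = file_size_bytes / bytes_per_param
print(f"Implied parameter count: {implied_params / 1e9:.2f}B")
# ~12.69B, roughly the 12B text backbone plus the 400M-parameter
# vision adapter and the remaining vision encoder weights.
```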
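The 2D RoPE mentioned for the vision encoder extends rotary position embeddings to two axes, so each image patch is encoded by its (row, column) grid position rather than a flat sequence index. The sketch below shows one common formulation, rotating half of each patch's feature vector by its row index and the other half by its column index; the announcement does not specify Mistral's exact frequency layout, so the half-split and function names here are illustrative assumptions.

```python
import numpy as np

def rope_1d(x, pos, base=10000.0):
    """Standard 1-D rotary embedding along the last axis.
    x: (..., d) with d even; pos: (...,) integer positions."""
    d = x.shape[-1]
    inv_freq = 1.0 / (base ** (np.arange(0, d, 2) / d))   # (d/2,)
    angles = pos[..., None] * inv_freq                    # (..., d/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def rope_2d(patches, rows, cols):
    """2-D RoPE sketch: rotate the first half of each feature
    vector by the patch's row index, the second half by its
    column index (an assumed layout, not Mistral's published one)."""
    half = patches.shape[-1] // 2
    out = patches.copy()
    out[..., :half] = rope_1d(patches[..., :half], rows)
    out[..., half:] = rope_1d(patches[..., half:], cols)
    return out

# Toy usage: a 4x4 grid of image patches with 64-dim features.
h = w = 4
feats = np.random.randn(h * w, 64)
rows = np.repeat(np.arange(h), w)   # row index of each patch
cols = np.tile(np.arange(w), h)     # column index of each patch
encoded = rope_2d(feats, rows, cols)
```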
Markets
No • 50% / Yes • 50% (resolution source: Official announcements from Mistral AI)
Yes • 50% / No • 50% (resolution source: Press releases or official announcements from major tech companies, e.g., Google, Microsoft, Amazon)
No • 50% / Yes • 50% (resolution source: Download statistics from GitHub)
Integration of Language and Vision • 25% / Language Processing • 25% / Ease of Use • 25% / Vision Processing • 25% (resolution source: User reviews on platforms like GitHub, Hugging Face, and tech blogs)
Entertainment • 25% / Retail • 25% / Healthcare • 25% / Finance • 25% (resolution source: Press releases or official announcements from companies within respective industries)
Other • 33% / GitHub • 33% / Hugging Face • 33% / Torrent • 33% (resolution source: Download statistics from the respective platforms: GitHub, Hugging Face, Torrent)