Will Pixtral 12B be integrated into a major commercial application by a leading tech company by end of 2024?
Yes • 50%
No • 50%
Resolution source: press releases or official announcements from major tech companies (e.g., Google, Microsoft, Amazon)
Mistral AI Releases 25.38 GB Pixtral 12B, Its First 12-Billion Parameter Multimodal Model
Sep 11, 2024, 08:15 AM
Mistral AI has released its first multimodal model, Pixtral 12B, which integrates both language and vision processing capabilities. The model, which is approximately 25.38 GB in size, features a 12-billion parameter architecture with 40 layers and a hidden dimension of 14,336. Key specifications include a text backbone based on Mistral Nemo 12B, a vision adapter with 400 million parameters, and a larger vocabulary of 131,072 tokens. The vision encoder uses GeLU and 2D RoPE, and the model introduces three new special tokens. Pixtral 12B is available via torrent and has been uploaded to platforms like GitHub and Hugging Face. This release marks a significant advancement in multimodal AI technology.
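The specifications reported above can be collected into a small sketch for a quick sanity check. The dictionary keys below are illustrative labels, not Mistral's official config field names, and the size estimate simply assumes 16-bit weights:

```python
# Key specifications of Pixtral 12B as reported in the release announcement.
# Key names are illustrative, not Mistral's official config keys.
pixtral_12b_specs = {
    "parameters": 12_000_000_000,             # text backbone (Mistral Nemo 12B)
    "vision_adapter_parameters": 400_000_000, # vision adapter
    "layers": 40,
    "hidden_dim": 14_336,                     # as reported in the release
    "vocab_size": 131_072,
    "new_special_tokens": 3,
    "download_size_gb": 25.38,
    "vision_encoder": {"activation": "GeLU", "positional_encoding": "2D RoPE"},
}

# Rough sanity check: ~12.4B total parameters at 2 bytes each (bf16/fp16)
# imply a checkpoint on the order of 25 GB, consistent with the reported size.
total_params = (pixtral_12b_specs["parameters"]
                + pixtral_12b_specs["vision_adapter_parameters"])
approx_size_gb = total_params * 2 / 1e9
print(round(approx_size_gb, 2))  # ≈ 24.8, close to the 25.38 GB torrent
```

The small gap between the 24.8 GB estimate and the 25.38 GB download is plausibly accounted for by embedding tables for the large vocabulary and file-format overhead.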
Related markets (the questions were not captured on this page; only the answer options remain):
- Several Yes / No markets, each at 50%
- Announcement venue: Yes, at CES 2025 • Yes, at Google I/O 2025 • Yes, at Microsoft Build 2025 • No (25% each)
- Benchmark milestone: top 1 • top 5 • top 10 • no significant milestone (25% each)
- Citation: in a research paper • in a project • in both • no (25% each)
- Adoption sector: Healthcare • Finance • Retail • Entertainment • Other (25% each)
- Key strength: Integration of Language and Vision • Language Processing • Ease of Use • Vision Processing (25% each)