Sep 17, 2024, 06:00 PM

Mistral AI Launches Open-Source Pixtral 12B Multimodal Text-Image Model

Mistral AI has launched Pixtral 12B, its first open-source multimodal model, which processes both text and images. The 12-billion-parameter model serves as a drop-in replacement for Mistral Nemo 12B and includes a new 400M-parameter vision encoder (also referred to as a 400M Vision Adapter). Pixtral 12B supports variable image sizes and multi-image input, and maintains strong performance on both text-only and multimodal benchmarks. The model uses GeLU and 2D RoPE, has a vocabulary of over 131,000 tokens, and accepts images up to 1024x1024 pixels, divided into patches of 16x16 pixels. Mistral AI has also made Pixtral 12B available on Le Chat and la Plateforme, introduced a free tier, and reduced prices across all its models to boost accessibility. The model is released under the Apache 2.0 license.
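To make the image figures above concrete, here is a minimal sketch of the patch arithmetic implied by a 16x16 patch size and a 1024x1024 maximum image. The function names are hypothetical and are not part of any Mistral library. The sketch assumes partial patches are rounded up and ignores any special tokens the model may add per image, so it counts patches only and may not match the model's actual image token count.

```python
import math

PATCH_SIZE = 16   # patch edge in pixels, from the story
MAX_SIDE = 1024   # maximum image edge in pixels, from the story


def patch_grid(width: int, height: int, patch_size: int = PATCH_SIZE) -> tuple[int, int]:
    """Return (rows, cols) of patches covering a width x height image.

    Assumption: a partial patch at the right or bottom edge counts as a full patch.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    cols = math.ceil(width / patch_size)
    rows = math.ceil(height / patch_size)
    return rows, cols


def patch_count(width: int, height: int, patch_size: int = PATCH_SIZE) -> int:
    """Return the total number of patches for a width x height image."""
    rows, cols = patch_grid(width, height, patch_size)
    return rows * cols


if __name__ == "__main__":
    # Largest supported image: a 64 x 64 grid of patches.
    print(patch_grid(MAX_SIDE, MAX_SIDE))
    print(patch_count(MAX_SIDE, MAX_SIDE))
    # A smaller, non-square image yields proportionally fewer patches.
    print(patch_count(512, 256))
```

Under these assumptions, a maximum-size image yields a 64 by 64 grid, or 4,096 patches, while a 512x256 image yields 512. Because the patch count scales with image area, variable image sizes let smaller images cost proportionally less.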
