French AI Lab Kyutai Unveils Moshi: A Real-Time Multimodal Model with 160ms Latency
Jul 3, 2024, 03:55 PM
Kyutai Labs, a French AI lab with $300M in funding, has unveiled Moshi, a real-time multimodal model that can listen and speak. The model runs on Apple laptops and consumer GPUs with a latency of 160ms, and it can operate without internet access, so no data leaves the user's computer. Developed by an eight-person team over six months, Moshi is set to be open-sourced, allowing for further enhancements. It pairs a 7B-parameter multimodal language model with two-channel audio I/O and a Real-Time Factor of 2. Designed for smooth, natural, and expressive communication, Moshi can serve as a coach, companion, or roleplay character. Its voice is still slightly robotic, but the model is remarkably fast and efficient, making it a strong competitor to OpenAI's GPT-4o.
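To make the latency and two-channel I/O figures concrete, the sketch below shows what a full-duplex local inference loop could look like: microphone audio is fed to a streaming model frame by frame while the model's generated audio is played back, and the per-frame processing time is logged. This is not Kyutai's released code; the `StreamingSpeechModel` class, the 24 kHz sample rate, and the 80 ms frame size are assumptions for illustration only.

```python
# Hypothetical sketch of a full-duplex, frame-by-frame local audio loop.
# StreamingSpeechModel, the 24 kHz rate, and the 80 ms frame size are
# illustrative assumptions, not Kyutai's released API.
import time
import numpy as np
import sounddevice as sd

SAMPLE_RATE = 24_000      # assumed audio sample rate
FRAME_SAMPLES = 1_920     # 80 ms frames at 24 kHz (assumed)


class StreamingSpeechModel:
    """Placeholder for a locally running speech-to-speech model."""

    def step(self, frame: np.ndarray) -> np.ndarray:
        # A real model would consume the user's audio channel here and
        # return the next frame of its own speech channel.
        return np.zeros_like(frame)


def duplex_loop(model: StreamingSpeechModel) -> None:
    # One sounddevice stream handles both capture (user channel in)
    # and playback (model channel out), frame by frame.
    with sd.Stream(samplerate=SAMPLE_RATE, channels=1, dtype="float32",
                   blocksize=FRAME_SAMPLES) as stream:
        while True:
            frame, _ = stream.read(FRAME_SAMPLES)       # user channel in
            t0 = time.perf_counter()
            reply = model.step(frame[:, 0])             # model channel out
            latency_ms = (time.perf_counter() - t0) * 1e3
            stream.write(reply.reshape(-1, 1))
            print(f"per-frame model latency: {latency_ms:.1f} ms")


if __name__ == "__main__":
    duplex_loop(StreamingSpeechModel())
```

In a loop like this, an end-to-end latency around 160ms means the model's per-frame work plus audio buffering must stay within roughly two 80 ms frames, which is why running entirely on a local consumer GPU or laptop matters.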