What will be the primary use case for Moshi among early adopters by December 31, 2024?
Real-time conversation • 25%
Role-playing • 25%
Providing explanations • 25%
Other • 25%
Surveys and usage reports from early adopters
Kyutai Labs Unveils Moshi: Real-Time Open-Source GPT-4o Alternative
Jul 3, 2024, 08:41 PM
Kyutai Labs, a French AI startup, has unveiled Moshi, a groundbreaking real-time multimodal foundation model that can listen, speak, and understand emotions. Moshi can run on consumer laptops and GPUs and is set to be open-sourced, offering a competitive alternative to OpenAI's GPT-4o. Developed by an eight-person team in just six months, Moshi keeps latency under 300 ms, reaching 160 ms with a Real-Time Factor of 2, and supports 70 different emotions and styles. Its capabilities include real-time conversation, role-playing, and providing explanations. Although the voice initially sounds somewhat robotic, Moshi's fast response times and natural interaction have been well received. The release includes the code, the model, and an accompanying research paper. Moshi is built on a 7B-parameter multimodal language model with a two-channel I/O system.