What will be the most popular input type for Gemini 2.0 by the end of 2025?
Text • 25%
Images • 25%
Video • 25%
Audio • 25%
Resolution source: usage statistics and reports from Google AI Studio
Google's Gemini 2.0 Launches with Multimodal Capabilities and Integration into Search
Dec 12, 2024, 06:20 AM
Google has launched Gemini 2.0, its most advanced artificial intelligence model to date, which CEO Sundar Pichai describes as the start of a "new agentic era" in AI development. The model is designed to understand the world around users and make decisions about it, with an improved ability to grasp context, think ahead, and execute actions under supervision. Key features include native image and audio generation. The first release, Gemini 2.0 Flash, is described as twice as fast as its predecessor, with low latency and improved performance. It accepts text, images, video, and audio as input and can produce several output formats, including generated images and text-to-speech audio. Gemini 2.0 is being integrated into Google's ecosystem, starting with Search and the Gemini platform, with broader integration into products such as Google Lens and Google Maps planned for early 2025. Developers can access the model through Google AI Studio and Vertex AI, with promised improvements to content creation, data analysis, and customer service experiences. Google's sixth-generation TPU hardware, codenamed Trillium, was used exclusively for both training and running Gemini 2.0. The company has also introduced several research prototypes, including Project Astra, Project Mariner, and Jules, to explore what AI agents can do.
AI Assistant (Project Astra) • 25%
AI Coding (Jules) • 25%
Gaming (Gemini 2.0 for Games) • 25%
Browser-based tasks (Project Mariner) • 25%
Real-time conversation • 25%
Image sharing • 25%
Other • 25%
Screen sharing • 25%
Other • 25%
Text Generation • 25%
Image Generation • 25%
Voice-to-Voice Interaction • 25%
Coding Assistant • 25%
Gaming Assistant • 25%
Virtual Assistant • 25%
Shopping Assistant • 25%
Enhanced multimodal capabilities • 25%
Autonomous task execution • 25%
Real-time voice interactions • 25%
Desktop information retrieval • 25%
Research • 25%
Other • 25%
Web Browsing • 25%
Gaming • 25%
Other • 25%
Android Integration • 25%
YouTube Content Management • 25%
Search Enhancements • 25%
Coding • 25%
Mathematics • 25%
Physics • 25%
Other • 25%
Education • 25%
Technology • 25%
Healthcare • 25%
Finance • 25%
Coding Assistance • 25%
Image Output • 25%
Audio Output • 25%
Agentic Features • 25%
Finance • 25%
Healthcare • 25%
Other • 25%
Retail • 25%
Automotive • 25%
Healthcare • 25%
Media and Entertainment • 25%
Other • 25%
No • 50%
Yes • 50%
Other • 25%
Content Creation • 25%
Data Analysis • 25%
Customer Service • 25%