Google's Gemini is advancing the Android user experience with screen-aware AI capabilities that turn the assistant into an interactive visual companion. Scheduled to roll out to Gemini Advanced subscribers later this month, these features mark a significant shift in how users interact with their devices, moving beyond simple voice commands to contextual visual understanding. This evolution positions Gemini as a more intuitive assistant that can respond to what users see, not just what they say.
The big picture: Google is enhancing Gemini with screen-sharing functionality that allows users to ask questions about content visible on their Android devices, mirroring capabilities already available on desktop versions.
- The feature enables contextual interactions, such as asking for shoe recommendations while viewing an image of a jacket, making assistance feel more natural.
- These capabilities are part of Google’s Project Astra, a broader initiative to develop multimodal AI that better perceives and understands its environment.
Key features: The upcoming Gemini update focuses on two major capabilities that expand how users can leverage AI assistance across applications.
- Users can share their screens with Gemini to ask questions about displayed content, whether browsing websites, viewing images, or reading documents (a developer-facing sketch of this kind of multimodal query follows this list).
- Real-time video interactions enable users to engage with Gemini about their surroundings by activating the camera within the app, similar to ChatGPT's Voice and Vision functionality.
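Google has not published developer details for the screen-sharing feature itself, but the underlying multimodal capability is exposed through the public Gemini API. The following is a minimal sketch, not the app's actual implementation, assuming the google-generativeai Python SDK, an API key in the GOOGLE_API_KEY environment variable, and a hypothetical screenshot.png capture:

```python
import os

import google.generativeai as genai
from PIL import Image

# Assumes GOOGLE_API_KEY is set in the environment.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Multimodal Gemini models accept interleaved image and text input.
model = genai.GenerativeModel("gemini-1.5-flash")

# Hypothetical screenshot of a shopping page showing a jacket.
screenshot = Image.open("screenshot.png")

# Ask a contextual question about the on-screen content, mirroring the
# shoe-recommendation example from the announcement.
response = model.generate_content(
    [screenshot, "What shoes would pair well with this jacket?"]
)
print(response.text)
```

In the consumer app, the screenshot step is handled automatically when the user shares their screen; the sketch only illustrates the kind of image-plus-question request involved.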
Practical applications: Gemini’s new capabilities will integrate with popular apps to provide contextual assistance without disrupting the user experience.
- While watching YouTube videos, users can activate Gemini to ask specific questions about content, such as inquiring about exercise techniques during fitness tutorials.
- When viewing PDFs, the "Ask about this PDF" option will allow users to request summaries or clarifications, streamlining research and information processing on mobile devices (sketched below).
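The in-app "Ask about this PDF" flow is a consumer feature, but comparable document question-answering is already possible against the Gemini API. A rough sketch under the same assumptions as above, with report.pdf as a placeholder for any local document uploaded through the File API:

```python
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Upload the PDF through the File API so the model can read it directly.
document = genai.upload_file(path="report.pdf")

model = genai.GenerativeModel("gemini-1.5-flash")

# Request a summary, similar to what the in-app option would surface.
response = model.generate_content(
    [document, "Summarize the key points of this document."]
)
print(response.text)
```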
Why this matters: By enabling Gemini to interpret and respond to visual inputs, Google is fundamentally changing how AI assistants function, creating more immersive and context-aware digital experiences.
- The screen-aware capabilities transform passive viewing into interactive experiences, potentially setting new benchmarks for AI assistant functionality.
- As these features reach Android users, they could significantly reduce the cognitive load of information processing by allowing the AI to assist with understanding and contextualizing on-screen content.