Do you see what I see? Google’s Gemini adds screen-aware AI to transform Android experience

Google's Gemini is advancing the Android user experience with screen-aware AI capabilities that turn the assistant into an interactive visual companion. Scheduled to roll out to Gemini Advanced subscribers later this month, these features represent a significant shift in how users interact with their devices, moving beyond simple voice commands to contextual visual understanding. This evolution positions Gemini as a more intuitive assistant that can respond to what users see rather than just what they say.

The big picture: Google is enhancing Gemini with screen-sharing functionality that allows users to ask questions about content visible on their Android devices, mirroring capabilities already available on desktop versions.

  • The feature enables contextual interactions, such as asking which shoes would pair with a jacket shown on screen, creating a more natural assistance experience.
  • These capabilities are part of Google’s Project Astra, a broader initiative to develop multimodal AI that better perceives and understands its environment.

Key features: The upcoming Gemini update focuses on two major capabilities that expand how users can leverage AI assistance across applications.

  • Users can share their screens with Gemini to ask questions about displayed content, whether browsing websites, viewing images, or reading documents.
  • Real-time video interactions enable users to engage with Gemini about their surroundings by activating the camera within the app, similar to ChatGPT's Voice and Vision functionality.

Practical applications: Gemini’s new capabilities will integrate with popular apps to provide contextual assistance without disrupting the user experience.

  • While watching YouTube videos, users can activate Gemini to ask specific questions about content, such as inquiring about exercise techniques during fitness tutorials.
  • When viewing PDFs, the “Ask about this PDF” option will allow users to request summaries or clarifications, streamlining research and information processing on mobile devices.

Why this matters: By enabling Gemini to interpret and respond to visual inputs, Google is fundamentally changing how AI assistants function, creating more immersive and context-aware digital experiences.

  • The screen-aware capabilities transform passive viewing into interactive experiences, potentially setting new benchmarks for AI assistant functionality.
  • As these features reach Android users, they could reduce the effort of processing on-screen information by letting the AI summarize and contextualize what users are viewing.