Google Lens now lets you search with videos too

Google Lens expands search capabilities with video and voice features: Google has added video and voice input to its Lens app, letting users search by recording what they see and asking questions aloud.

  • The update, rolling out in Search Labs on Android and iOS, enables users to record short videos and ask questions about what they’re seeing.
  • Google’s Gemini AI model processes the video content and user queries to provide relevant responses and search results.
  • The new feature builds upon Google’s existing image search capabilities, applying computer vision techniques to analyze multiple video frames in sequence.

How it works: Users can now combine video recording and voice input in Google Lens, making visual searches more dynamic and intuitive.

  • To use the new feature, users open the Google Lens app, hold down the shutter button to start recording, and verbally ask a question about what they’re observing.
  • The system captures the video as a series of image frames, which are then analyzed using advanced computer vision techniques.
  • A custom Gemini AI model processes the visual information and the user’s query, returning a response grounded in web-based information (a rough sketch of this flow appears below).
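
For a concrete picture of the flow described above, here is a minimal sketch that uses the public Gemini Python SDK and OpenCV as stand-ins for whatever Google runs internally; the model name, frame-sampling rate, and file names are all assumptions, not details from the announcement.

```python
# Illustrative sketch only: approximates the Lens video-search flow with public
# tools (OpenCV + google-generativeai). Google's internal pipeline, model choice,
# and sampling rate are not public; everything below is an assumption.
import cv2
from PIL import Image
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")              # placeholder credential
model = genai.GenerativeModel("gemini-1.5-flash")    # assumed model choice

def sample_frames(video_path: str, every_n: int = 15) -> list[Image.Image]:
    """Decode the clip and keep every Nth frame as a PIL image."""
    cap = cv2.VideoCapture(video_path)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            # OpenCV decodes to BGR; convert to RGB before handing to PIL
            frames.append(Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)))
        idx += 1
    cap.release()
    return frames

def ask_about_clip(video_path: str, question: str) -> str:
    """Send the sampled frames plus the user's question to the model."""
    frames = sample_frames(video_path)
    response = model.generate_content([question, *frames])
    return response.text

print(ask_about_clip("aquarium.mp4", "Why are these fish swimming together?"))
```

In the actual product the recording, sampling, and querying all happen inside the Lens app; the sketch just makes the frames-plus-question shape of the request explicit.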

Practical applications: The video search feature opens up new possibilities for users to interact with their environment and obtain information in real-time.

  • Google suggests the feature could be useful in scenarios such as visiting an aquarium, where users can ask questions about the marine life they’re observing.
  • The technology allows for more contextual and detailed queries that may not be easily captured in a single image.

Voice search enhancement: In addition to video search, Google Lens has also updated its photo search feature with voice input capabilities.

  • Users can now ask questions verbally while aiming their camera at a subject, eliminating the need to type queries after taking a picture.
  • This feature is rolling out globally on Android and iOS but is currently available only in English; a minimal sketch of the voice-plus-photo flow appears below.
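
As a rough illustration of that voice-plus-photo flow, the snippet below pairs the SpeechRecognition library with the public Gemini SDK; the libraries, model name, and file names are stand-ins chosen for the example, not details Google has confirmed.

```python
# Hedged illustration: transcribe a spoken question, then pair it with a single
# camera frame, mirroring the photo-plus-voice behavior described above. The
# libraries and model below are assumptions, not Google's actual stack.
import speech_recognition as sr
from PIL import Image
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")              # placeholder credential
model = genai.GenerativeModel("gemini-1.5-flash")    # assumed model choice

recognizer = sr.Recognizer()
with sr.Microphone() as mic:
    audio = recognizer.listen(mic)                   # capture the spoken question

question = recognizer.recognize_google(audio)        # speech-to-text (English, per the rollout)
photo = Image.open("viewfinder_frame.jpg")           # stand-in for the live camera frame

response = model.generate_content([question, photo])
print(response.text)
```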

Technical insights: Rajan Patel, Google’s vice president of engineering, provided some background on the technology powering these new features.

  • The video search functionality builds upon existing image recognition techniques used in Google Lens.
  • A custom Gemini model was developed specifically to understand and process multiple video frames in sequence.
  • While the current implementation doesn’t support audio analysis, such as identifying bird sounds, Google is reportedly experimenting with that capability for future updates (a speculative sketch of such a flow appears below).
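
Because audio analysis is explicitly not supported yet, the following is purely speculative: a sketch of how a clip’s soundtrack could be separated and queried using ffmpeg and the public Gemini Files API, offered only to show what the experiment might look like in practice.

```python
# Speculative sketch: the article says bird-sound identification is only being
# experimented with, so none of this reflects a shipped Google feature. It uses
# ffmpeg and the public Gemini Files API as stand-ins.
import subprocess
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")              # placeholder credential
model = genai.GenerativeModel("gemini-1.5-flash")    # assumed model choice

# Strip the audio track from the recorded clip (-vn drops the video stream).
subprocess.run(["ffmpeg", "-y", "-i", "clip.mp4", "-vn", "clip_audio.mp3"], check=True)

# Upload the audio and query it the same way frames are queried today.
audio = genai.upload_file(path="clip_audio.mp3")
response = model.generate_content(["Which bird is singing in this recording?", audio])
print(response.text)
```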

Broader implications: These advancements in visual search technology reflect the ongoing evolution of how users interact with and obtain information from their surroundings.

  • The integration of video and voice search in Google Lens represents a significant step towards more natural and intuitive human-computer interaction.
  • As AI models like Gemini continue to improve, we can expect even more sophisticated visual search capabilities in the future, potentially transforming how we access and process information in our daily lives.

Looking ahead: While these features mark a substantial improvement in visual search technology, there’s still room for growth and refinement.

  • The potential addition of audio analysis to video searches could further enhance the app’s utility, especially for tasks like wildlife identification.
  • As the technology evolves, we may see more seamless integration of visual, auditory, and contextual information in search queries, bringing us closer to a truly comprehensive understanding of our environment through AI-assisted tools.