Google Lens expands search capabilities with video and voice features: Google has added video and voice input to its Lens app, letting users search with short clips and spoken questions rather than still images alone.
- The update, rolling out in Search Labs on Android and iOS, enables users to record short videos and ask questions about what they’re seeing.
- Google’s Gemini AI model processes the video content and user queries to provide relevant responses and search results.
- The new feature builds upon Google’s existing image search capabilities, applying computer vision techniques to analyze multiple video frames in sequence.
How it works: Users can now leverage video recording and voice input to interact with Google Lens, making visual searches more dynamic and intuitive.
- To use the new feature, users open the Google Lens app, hold down the shutter button to start recording, and verbally ask a question about what they’re observing.
- The system captures the video as a series of image frames, which are then analyzed using advanced computer vision techniques.
- A custom Gemini AI model processes the visual information and the user's query, returning a response grounded in information from the web (a code sketch of this pipeline follows the list).
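For a concrete picture of what such a pipeline might look like, here is a minimal sketch in Python. It is not Google's implementation: the frame-sampling interval, the `gemini-1.5-flash` model name, and the use of the public google-generativeai SDK are illustrative assumptions, and a real Lens-style system would also ground its answers in web search results.

```python
# Minimal sketch of a Lens-style video query pipeline (NOT Google's
# implementation; sampling rate and model name are assumptions).
import cv2  # pip install opencv-python
from PIL import Image
import google.generativeai as genai  # pip install google-generativeai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential

def sample_frames(video_path: str, every_n: int = 15) -> list[Image.Image]:
    """Decode a short clip into a sparse sequence of PIL frames."""
    cap = cv2.VideoCapture(video_path)
    frames, i = [], 0
    while True:
        ok, bgr = cap.read()
        if not ok:
            break
        if i % every_n == 0:
            # OpenCV decodes to BGR; convert to RGB for the model.
            frames.append(Image.fromarray(cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)))
        i += 1
    cap.release()
    return frames

def ask_about_video(video_path: str, question: str) -> str:
    # The public Gemini API accepts interleaved images and text.
    model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name
    response = model.generate_content([*sample_frames(video_path), question])
    return response.text

print(ask_about_video("aquarium.mp4", "Why are these fish swimming together?"))
```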
Practical applications: The video search feature opens up new possibilities for users to interact with their environment and obtain information in real-time.
- Google suggests the feature could be useful in scenarios such as visiting an aquarium, where users can ask questions about the marine life they’re observing.
- The technology allows for more contextual and detailed queries that may not be easily captured in a single image.
Voice search enhancement: In addition to video search, Google Lens has also updated its photo search feature with voice input capabilities.
- Users can now ask questions verbally while aiming their camera at a subject, eliminating the need to type queries after taking a picture (see the sketch after this list).
- This feature is rolling out globally on Android and iOS but is currently only available in English.
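A hedged sketch of this voice-plus-photo flow, assuming the SpeechRecognition library for transcription and the same public Gemini SDK as above; neither choice is necessarily what Lens uses.

```python
# Illustrative voice-plus-photo search: transcribe a spoken question and
# send it with a still image to a multimodal model. Library choices are
# assumptions, not Google Lens internals.
import speech_recognition as sr  # pip install SpeechRecognition
from PIL import Image
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential

def voice_photo_search(image_path: str) -> str:
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:  # listen while the camera is aimed
        audio = recognizer.listen(source, phrase_time_limit=5)
    question = recognizer.recognize_google(audio)  # speech-to-text (English)
    model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name
    return model.generate_content([Image.open(image_path), question]).text
```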
Technical insights: Rajan Patel, Google’s vice president of engineering, provided some background on the technology powering these new features.
- The video search functionality builds upon existing image recognition techniques used in Google Lens.
- A custom Gemini model was developed specifically to understand and process multiple video frames in sequence (a sketch below shows one way to convey frame order to a general multimodal model).
- While the current implementation doesn’t support audio analysis, such as identifying bird sounds, Google is reportedly experimenting with this capability for future updates.
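Since the public Gemini API does not expose Google's custom sequencing model, one illustrative workaround is to interleave timestamp labels with the frames so that a general multimodal model can infer their order. The helper name and labeling scheme below are hypothetical.

```python
# Illustrative workaround only: Google's custom model presumably handles
# frame order natively, but the public API has no explicit "sequence"
# input, so we interleave timestamp labels with the frames.
from PIL import Image

def build_sequenced_prompt(frames: list[Image.Image],
                           seconds_between_frames: float,
                           question: str) -> list:
    """Hypothetical helper: label each frame with its capture time."""
    parts: list = ["These frames are in chronological order:"]
    for i, frame in enumerate(frames):
        parts.append(f"frame at t = {i * seconds_between_frames:.1f}s:")
        parts.append(frame)
    parts.append(question)
    return parts

# Usage with the sampling helper from the earlier sketch:
# model.generate_content(build_sequenced_prompt(frames, 0.5, "What changed?"))
```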
Broader implications: These advancements in visual search technology reflect the ongoing evolution of how users interact with and obtain information from their surroundings.
- The integration of video and voice search in Google Lens represents a significant step towards more natural and intuitive human-computer interaction.
- As AI models like Gemini continue to improve, we can expect even more sophisticated visual search capabilities in the future, potentially transforming how we access and process information in our daily lives.
Looking ahead: While these features mark a substantial improvement in visual search technology, there’s still room for growth and refinement.
- The potential addition of audio analysis to video searches could further enhance the app’s utility, especially for tasks like wildlife identification.
- As the technology evolves, we may see more seamless integration of visual, auditory, and contextual information in search queries, bringing us closer to a truly comprehensive understanding of our environment through AI-assisted tools.