Google is working on new image understanding tools for Gemini that would let users draw directly on images before submitting them. The feature, discovered in an unreleased version of the Google app, lets users highlight specific areas of an image so that Gemini can use these visual cues to give more accurate and relevant responses. The functionality is also expected to integrate with Google’s ‘Nano Banana’ tool for advanced image editing.
Gemini’s Image Markup Tools: Precision for Your Visual Queries
An analysis of an upcoming Google app version (16.42.61.sa.arm64) by Android Authority has uncovered image markup tools that Google is developing for Gemini. The tools will let users draw directly on an image, whether chosen from the device’s gallery or freshly captured with the camera, before uploading it to Gemini.
By highlighting a particular area of an image, users can direct Gemini’s attention to it. This targeted approach is expected to improve the relevance and accuracy of Gemini’s responses to complex visual queries.
This development follows recent updates that introduced AI image editing capabilities directly within Google Lens and AI Mode.
(Image: Screenshots shared by Android Authority illustrate the upcoming feature.)
The shared screenshots indicate that Gemini will infer the user’s focus from the highlighted portions alone, even when the accompanying instructions are not detailed. Users will also be able to issue specific commands about the marked sections. The integration with Nano Banana’s editing suite means users can also clean up screenshots by removing irrelevant objects before sending them to Gemini.
These advancements align with other recent updates, such as Google’s introduction of new features to Video Overviews, designed to enhance learning with AI.
Features found in development builds are not guaranteed to appear in public releases. Typically, such features are rolled out to beta testers first, with a wider release following later, if at all.
This image markup tool is just one of many Gemini features currently under wraps. Recent reports from other APK teardowns suggest that users might soon be able to deliver extended voice commands to Gemini by simply long-pressing the microphone icon in the input box, bypassing the need for “Gemini Live.”
Further interface enhancements for Gemini are also hinted at in these development builds. Coinciding with these findings, Google recently expanded its ‘Summarise Pages’ functionality to both stable and beta versions of Chrome for Android, allowing users to get quick summaries of web content.