
Google's 'banana' image model enhances Gemini AI's editing capabilities
What's the story
Google has updated its Gemini chatbot with a new artificial intelligence (AI) image model, dubbed Gemini 2.5 Flash Image. The upgrade is aimed at giving users more control over photo editing and is part of Google's strategy to compete with OpenAI's popular image tools. The update is now available to all users of the Gemini app and to developers through the Gemini API, Google AI Studio, and Vertex AI platforms.
Enhanced capabilities
Gemini's new image model can make precise edits
The new AI image model in Gemini is designed to make more precise edits based on natural language requests from users. It also maintains the consistency of faces, animals, and other details, something that most rival tools struggle with. The model had already caught the attention of social media users, who praised an impressive AI image editor that appeared under the pseudonym "nano-banana" on the crowdsourced evaluation platform LMArena.
Product insight
Google is pushing visual quality and instruction-following ability
Nicole Brichtova, a product lead on visual generation models at Google DeepMind, told TechCrunch that Google is pushing both visual quality and the model's ability to follow instructions with this update. She said, "This update does a much better job making edits more seamlessly, and the model's outputs are usable for whatever you want to use them for."
User focus
The image model can merge multiple references into one render
Brichtova said that Google's image model was specifically designed with consumer use cases in mind, like helping users visualize their home and garden projects. The model also has better "world knowledge" and can combine multiple references in a single prompt. For instance, it can merge an image of a sofa, a living room photo, and a color palette into one cohesive render.
Safety measures
Google has implemented safeguards to limit user creations
Alongside the advanced capabilities of Gemini's new AI image generator, Google has implemented safeguards to limit what users can create. The company has previously struggled to get these safeguards right, but Brichtova says it has now struck a better balance. "We want to give users creative control so that they can get from the models what they want," she explained, adding that "it's not like anything goes."
Industry competition
Gemini still trails ChatGPT in users
The launch of OpenAI's GPT-4o with a native image generator in March sent ChatGPT's usage soaring. Now, Meta has announced plans to license AI image models from start-up Midjourney. Meanwhile, Black Forest Labs's FLUX AI image models continue to dominate benchmarks. Amid this stiff competition, Google's Gemini had 450 million monthly users as of July. Since weekly users cannot exceed monthly users, Gemini's weekly count is below 450 million, well short of ChatGPT's more than 700 million weekly users.