🚀 Excited to launch "Conversational Image Segmentation" for Gemini 2.5. Now you can segment any image with natural language. Think complex queries ("people throwing frisbees"), conditional logic ("workers not wearing hard hats"), and even abstract concepts ("areas with weather damage"). 🤖 If you can describe it, Gemini can segment it. Tell me what you think! cc: @xu_bibo @jalayrac @bcaine @tulseedoshi @AniBaddepudi
@RohanLikesAI Each image tells a story worth 1000 words, I think this just 10x'd that idea...
@RohanLikesAI I understand this was available in Gemini 2.0 Flash (experimental) too. What's new in this release? O:
@RohanLikesAI Woah this is awesome. What’s ur favorite demo or use case?
@RohanLikesAI We have been using this feature for a while now to build out a shop-the-look feature. It is quite impressive and makes it easier to work with segmentation across a category space as broad as ours. But we still see some issues and are hitting some limits, which might be interesting for you!
@RohanLikesAI This opens up so many possibilities. Doing this in real time would be a game changer!
Cool stuff. I don't think the model has strong spatial awareness baked into it yet (could be that it wasn't trained specifically on this type of task). For example, using this image from MS COCO, it's unable to accurately detect the right-most vehicle. However, when using CoT w/ a reasoning model (o4-mini), I'm able to get it to near perfection. Thoughts?
@RohanLikesAI Didn't work as expected for me, perhaps because there were too many masks? I tried the playground app to get segmentation masks for "all hail damages on shingles" and the masks returned were unusable.
@RohanLikesAI @AskPerplexity What's this feature? Provide any sample link