LocateAnything: Fast Visual Grounding AI Demo

Detect and label objects in images and videos. LocateAnything is an NVIDIA vision-language model that uses natural-language prompts to find objects, text, GUI elements, and points in images.
