Advanced open-set object detection suite with Pro and Edge variants. Pro scales up the architecture with an enhanced vision backbone and more than 20M training images, achieving 54.3 AP on COCO and 55.7 AP on LVIS-minival in zero-shot transfer. Edge is optimized for deployment and runs at 75.2 FPS with TensorRT.

Outputs 2

Grounding DINO 1.5

model

Grounding DINO 1.5 Pro for high-accuracy open-set detection and Grounding DINO 1.5 Edge for efficient edge deployment.

Variants

Name | Parameters | Notes
Grounding DINO 1.5 Pro | not listed | High-performance, strong generalization, 54.3 AP on COCO
Grounding DINO 1.5 Edge | not listed | Optimized for edge deployment, 75.2 FPS with TensorRT

Grounding DINO 1.5: Advance the Edge of Open-Set Object Detection

paper

Technical report on scaling Grounding DINO with a larger architecture, an enhanced backbone, and expanded training data.

arXiv: 2405.10300

vision, open-vocabulary