Grounded SAM 2

Combines Grounding DINO with SAM 2 for text-prompted segmentation and tracking.

Open Source · Self Hosted · Offline Capable · GPU Required (8GB+ VRAM)

About

Grounded SAM 2, from IDEA Research, is a pipeline that pairs Grounding DINO's text-prompted detection with SAM 2's segmentation and tracking, so that objects described in natural language can be detected, segmented, and tracked across images and video. It combines existing open-world models rather than training a new one, and it also supports related detectors such as Florence-2 and DINO-X. It is released under the Apache 2.0 license.
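One piece of glue in a detect-then-segment pipeline like this is handing the detector's boxes to the segmenter as box prompts. The sketch below is illustrative only and is not taken from the project's code. It assumes the detector returns boxes as normalized (cx, cy, w, h), as the original Grounding DINO inference code does, and that the segmenter expects pixel-space (x0, y0, x1, y1) boxes. The function name `cxcywh_norm_to_xyxy_pixels` is hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]


def cxcywh_norm_to_xyxy_pixels(
    boxes: Sequence[Sequence[float]], width: int, height: int
) -> List[Box]:
    """Convert normalized (cx, cy, w, h) boxes to pixel (x0, y0, x1, y1).

    Hypothetical helper: the input format is an assumption about the
    detector, and the output format is an assumption about the segmenter.
    Corners are clamped to the image bounds.
    """
    max_x, max_y = float(width), float(height)
    converted: List[Box] = []
    for cx, cy, w, h in boxes:
        # Scale center and size from the [0, 1] range to pixels.
        x0 = (cx - w / 2) * width
        y0 = (cy - h / 2) * height
        x1 = (cx + w / 2) * width
        y1 = (cy + h / 2) * height
        # Clamp so boxes that spill past the edge stay inside the image.
        converted.append((
            min(max(x0, 0.0), max_x),
            min(max(y0, 0.0), max_y),
            min(max(x1, 0.0), max_x),
            min(max(y1, 0.0), max_y),
        ))
    return converted


if __name__ == "__main__":
    # A centered box covering half the width and height of a 200x100 image.
    print(cxcywh_norm_to_xyxy_pixels([[0.5, 0.5, 0.5, 0.5]], 200, 100))
    # prints [(50.0, 25.0, 150.0, 75.0)]
```

The resulting pixel boxes are what would then be passed on as box prompts. If the detector wrapper in use already returns pixel (x0, y0, x1, y1) boxes, this step is unnecessary.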

Details

Price: Free
Platform: Local/Desktop
Difficulty: Intermediate (3/5)
License: Apache-2.0
Minimum VRAM: 8 GB
Added: Apr 3, 2026

Related Tools

Simple and effective multi-object tracking using every detection box.

Open Source · Self Hosted · Offline · GPU 4GB+ · Intermediate

Monocular depth estimation model producing detailed depth maps from single images.

Open Source · Self Hosted · Offline · GPU 4GB+ · Easy

End-to-end object detection with transformers by Meta, eliminating hand-designed components.

Open Source · Self Hosted · Offline · GPU 8GB+ · Advanced

Self-supervised vision transformer by Meta producing universal visual features.

Open Source · Self Hosted · Offline · GPU 6GB+ · Intermediate

Unified vision foundation model by Microsoft for captioning, detection, and segmentation.

Open Source · Self Hosted · Offline · GPU 6GB+ · Intermediate

Robust multi-object tracking combining motion and appearance cues.

Open Source · Self Hosted · Offline · GPU 4GB+ · Intermediate