Grounded SAM 2
Combines Grounding DINO with SAM 2 for text-prompted segmentation and tracking.
About
Grounded SAM 2 by IDEA Research is a pipeline that pairs text-prompted detection from Grounding DINO with the segmentation and tracking of SAM 2, so that objects described in natural language can be detected, segmented, and tracked across images and video. It composes existing open-world models rather than training a new one, and it also supports related detectors such as Florence-2 and DINO-X. It is released under the Apache 2.0 license.
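The detect-then-segment flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual API: `Detection`, `SegmentedObject`, `ground_and_segment`, the `min_score` threshold, and the toy `toy_detect` / `toy_segment` stand-ins are all hypothetical names invented here. In a real setup the detector role would be played by a text-prompted model such as Grounding DINO and the segmenter role by SAM 2; tracking across video frames is left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels
Mask = List[List[bool]]                  # row-major binary mask


@dataclass
class Detection:
    label: str
    box: Box
    score: float


@dataclass
class SegmentedObject:
    object_id: int
    label: str
    box: Box
    mask: Mask


# Hypothetical interfaces: any text-prompted detector and any
# box-prompted segmenter could be plugged in here.
Detector = Callable[[object, str], List[Detection]]
Segmenter = Callable[[object, Box], Mask]


def ground_and_segment(image, prompt: str, detect: Detector,
                       segment: Segmenter,
                       min_score: float = 0.3) -> List[SegmentedObject]:
    """Detect objects matching the prompt, then segment each kept box."""
    objects: List[SegmentedObject] = []
    for det in detect(image, prompt):
        if det.score < min_score:
            continue  # drop low-confidence boxes before segmenting
        mask = segment(image, det.box)
        objects.append(SegmentedObject(len(objects), det.label, det.box, mask))
    return objects


def toy_detect(image, prompt: str) -> List[Detection]:
    # Stand-in detector: returns fixed boxes regardless of image content.
    return [Detection(prompt, (0, 0, 2, 2), 0.9),
            Detection(prompt, (1, 1, 3, 3), 0.1)]


def toy_segment(image, box: Box) -> Mask:
    # Stand-in segmenter: marks every pixel inside the box as foreground.
    x0, y0, x1, y1 = box
    height, width = len(image), len(image[0])
    return [[x0 <= x < x1 and y0 <= y < y1 for x in range(width)]
            for y in range(height)]
```

Passing the detector and segmenter as plain callables mirrors the idea of assembling existing models: swapping in a different detector only changes the function handed to `ground_and_segment`, not the pipeline itself.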
Details
- Price: Free
- Platform: Local/Desktop
- Difficulty: Intermediate (3/5)
- License: Apache-2.0
- Minimum VRAM: 8 GB
- Added: Apr 3, 2026
Related Tools
Simple and effective multi-object tracking using every detection box.
Monocular depth estimation model producing detailed depth maps from single images.
End-to-end object detection with transformers by Meta, eliminating hand-designed components.
Self-supervised vision transformer by Meta producing universal visual features.
Unified vision foundation model by Microsoft for captioning, detection, and segmentation.
Robust multi-object tracking combining motion and appearance cues.