DETR
End-to-end object detection with transformers by Meta, eliminating hand-designed components.
About
DETR, the Detection Transformer from Meta, reframes object detection as a direct set-prediction problem solved by a transformer encoder-decoder, removing hand-designed components such as anchor boxes and non-maximum suppression. A bipartite matching loss assigns each ground-truth object to exactly one prediction, which forces the model to make unique predictions rather than duplicates. The approach reaches accuracy comparable to a Faster R-CNN ResNet-50 baseline on COCO with simpler inference code. Pretrained models are provided. Released under the Apache 2.0 license.
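The bipartite matching step can be illustrated with a small sketch. This is not DETR's actual implementation: it brute-forces every assignment with the standard library instead of using the Hungarian algorithm, it omits the generalized IoU term that DETR's matching cost also includes, and the names `match`, `l1`, and `box_weight` are hypothetical helpers chosen for illustration.

```python
from itertools import permutations

def l1(a, b):
    """L1 distance between two boxes given as equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b))

def match(preds, targets, box_weight=5.0):
    """Return (pred_idx, target_idx) pairs with minimal total matching cost.

    Cost per pair = -(predicted probability of the target's class)
                    + box_weight * L1 box distance.
    box_weight is an illustrative value, not a tuned setting.
    Brute force over assignments; only practical for a handful of boxes.
    """
    def cost(p, t):
        return -p["probs"][t["label"]] + box_weight * l1(p["box"], t["box"])

    best_cost, best = float("inf"), []
    # Each permutation picks one distinct prediction per target.
    for perm in permutations(range(len(preds)), len(targets)):
        total = sum(cost(preds[i], targets[j]) for j, i in enumerate(perm))
        if total < best_cost:
            best_cost = total
            best = [(i, j) for j, i in enumerate(perm)]
    return best

# Three predictions (class probabilities + box), two ground-truth objects.
preds = [
    {"probs": [0.1, 0.9], "box": [0.5, 0.5, 0.2, 0.2]},
    {"probs": [0.8, 0.2], "box": [0.2, 0.3, 0.1, 0.1]},
    {"probs": [0.5, 0.5], "box": [0.9, 0.9, 0.1, 0.1]},
]
targets = [
    {"label": 0, "box": [0.2, 0.3, 0.1, 0.1]},
    {"label": 1, "box": [0.5, 0.5, 0.2, 0.2]},
]

matches = match(preds, targets)
matched_preds = {i for i, _ in matches}
# Predictions left unmatched would be trained toward a "no object" class.
unmatched = [i for i in range(len(preds)) if i not in matched_preds]
```

Because every ground-truth object is paired with a different prediction, two predictions cannot both be rewarded for the same object, which is why no separate non-maximum suppression step is needed.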
Details
- Price: Free
- Platform: Local/Desktop
- Difficulty: Advanced (4/5)
- License: Apache-2.0
- Minimum VRAM: 8 GB
- Added: Apr 3, 2026
Related Tools
Simple and effective multi-object tracking using every detection box.
Monocular depth estimation model producing detailed depth maps from single images.
Self-supervised vision transformer by Meta producing universal visual features.
Unified vision foundation model by Microsoft for captioning, detection, and segmentation.
Monocular depth estimation model by Intel ISL supporting multiple backbones.
Robust multi-object tracking combining motion and appearance cues.