DINOv2

Self-supervised vision transformer by Meta producing universal visual features.

Open Source, Self Hosted, Offline Capable, GPU Required (6 GB+ VRAM)

About

DINOv2 is a family of self-supervised vision transformers from Meta AI, pretrained on 142 million curated images without any labels. The resulting features are general purpose: a frozen DINOv2 backbone with a simple linear head handles image classification, semantic segmentation, and depth estimation without fine-tuning the backbone, and its embeddings can be used directly for retrieval. The largest model reaches up to 87.1 percent linear-probe accuracy on ImageNet. Four sizes are published: ViT-S at 21M parameters, ViT-B, ViT-L, and ViT-g at 1.1B, with the three smaller models distilled from ViT-g. Each is also available with register tokens, which improve feature map quality. Pretrained classification, segmentation, and depth heads are provided for immediate use, and specialized variants exist for X-ray and cell microscopy imagery under non-commercial research licenses. The main code and weights are licensed under Apache 2.0, and the model family has since been continued by DINOv3. Computer vision researchers and practitioners commonly adopt DINOv2 as a frozen feature extractor when labeled data is scarce.
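The frozen-backbone workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's reference code: it assumes PyTorch is installed and that the weights can be downloaded through torch.hub from the facebookresearch/dinov2 repository. The helper names (patch_grid, load_frozen_backbone, build_linear_probe, classify) are hypothetical and exist only for this example.

```python
# Sketch only: helper names are illustrative, not part of the DINOv2 codebase.
PATCH_SIZE = 14  # DINOv2 ViT backbones split images into 14x14 pixel patches

# Width of the feature vector returned by each torch.hub backbone.
EMBED_DIM = {
    "dinov2_vits14": 384,
    "dinov2_vitb14": 768,
    "dinov2_vitl14": 1024,
    "dinov2_vitg14": 1536,
}


def patch_grid(height, width, patch=PATCH_SIZE):
    """Return (rows, cols) of patch tokens for an input of the given size."""
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    return height // patch, width // patch


def load_frozen_backbone(name="dinov2_vits14"):
    """Download a backbone via torch.hub and freeze all of its weights."""
    import torch  # imported lazily so the helpers above work without PyTorch

    model = torch.hub.load("facebookresearch/dinov2", name)
    model.eval()
    for param in model.parameters():
        param.requires_grad_(False)
    return model


def build_linear_probe(name, num_classes):
    """Create the only trainable part: one linear layer on the image feature."""
    import torch.nn as nn

    return nn.Linear(EMBED_DIM[name], num_classes)


def classify(backbone, head, images):
    """images: float tensor (batch, 3, H, W), normalized, H and W multiples of 14."""
    import torch

    with torch.no_grad():          # no gradients flow into the frozen backbone
        features = backbone(images)  # shape (batch, embed_dim)
    return head(features)            # logits; gradients reach the head only
```

In this setup only the linear layer is trained, which is why the approach suits small labeled datasets: a 224x224 input yields a 16x16 grid of patches, and the backbone condenses it to a single 384-dimensional vector for the ViT-S model.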

Details

Category: Computer Vision & Object Detection
Price: Free
Platform: Local/Desktop
Difficulty: Intermediate (3/5)
License: Apache-2.0
Minimum VRAM: 6 GB
Added: Apr 3, 2026
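
As a rough sanity check on the VRAM figure, the memory needed just to hold the weights can be estimated from the parameter counts. The sketch below is back-of-the-envelope arithmetic, not a measurement: the ViT-B and ViT-L counts (86M and 300M) are taken from the published model table rather than this listing, the function name weight_gb is hypothetical, and activations, framework overhead, and batch size add to the totals.

```python
# Approximate parameter counts for the four backbones.
PARAMS = {
    "ViT-S": 21_000_000,
    "ViT-B": 86_000_000,
    "ViT-L": 300_000_000,
    "ViT-g": 1_100_000_000,
}


def weight_gb(num_params, bytes_per_param):
    """Memory for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


for name, count in PARAMS.items():
    fp32 = weight_gb(count, 4)  # 32-bit floats: 4 bytes per parameter
    fp16 = weight_gb(count, 2)  # 16-bit floats: 2 bytes per parameter
    print(f"{name}: {fp32:.2f} GB in fp32, {fp16:.2f} GB in fp16")
```

By this estimate the ViT-g weights alone take about 4.4 GB in fp32, while the smaller backbones need well under 2 GB, so actual requirements depend heavily on which size is chosen.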

Related Tools

CLIP

DeepFace

Depth Anything V1

Depth Anything V2

Detectron2

ByteTrack