Depth Anything V1

Foundation model for monocular depth estimation by TikTok.

Open Source, Self Hosted, Offline Capable, GPU Required (4GB+ VRAM)

About

Depth Anything is a foundation model for monocular depth estimation, presented at CVPR 2024 by researchers from HKU, TikTok, CUHK, and ZJU. It is trained on 1.5 million labeled images together with more than 62 million unlabeled ones, which helps it generalize to unseen scenes when given only a single photograph. Three variants are released: small (24.8M parameters), base (97.5M), and large (335.3M). All three produce relative depth maps, and fine-tuned checkpoints provide metric depth for NYUv2 and KITTI. The authors report that the model outperforms MiDaS v3.1 on benchmarks including Sintel, ETH3D, and DIODE, and that its encoder transfers well to tasks such as semantic segmentation. It is widely adopted in image-generation tooling: a depth-conditioned ControlNet is provided, and integrations exist for Hugging Face Transformers, ONNX, TensorRT, ComfyUI, InstantID, and InvokeAI, along with community ports for Android and Jetson. The code is released under Apache 2.0. Depth Anything V2 has since superseded it.
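As a rough illustration of the Hugging Face Transformers integration mentioned above, the sketch below runs the generic `depth-estimation` pipeline and rescales the relative depth map for viewing. The checkpoint ID `LiheYoung/depth-anything-small-hf` is an assumption about how the small variant is published on the Hub, and the helper names `depth_to_uint8` and `estimate_depth` are hypothetical, introduced here only for illustration.

```python
import numpy as np

# Assumed Hub ID for the small variant; swap in whichever checkpoint you use.
MODEL_ID = "LiheYoung/depth-anything-small-hf"


def depth_to_uint8(depth):
    """Min-max scale a relative depth map to 0..255 for visualization."""
    depth = np.asarray(depth, dtype=np.float32)
    span = depth.max() - depth.min()
    if span == 0:
        # A constant map carries no relative depth; return all zeros.
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - depth.min()) / span
    return (scaled * 255.0).round().astype(np.uint8)


def estimate_depth(image_path, model_id=MODEL_ID):
    """Run the depth-estimation pipeline on one image file (hypothetical helper)."""
    # Imported lazily so depth_to_uint8 works without these packages installed.
    from PIL import Image
    from transformers import pipeline

    pipe = pipeline(task="depth-estimation", model=model_id)
    result = pipe(Image.open(image_path))
    # "predicted_depth" holds the raw relative depth values as a torch tensor.
    raw = result["predicted_depth"].squeeze().cpu().numpy()
    return depth_to_uint8(raw)
```

Because the output is relative depth, the rescaled values are comparable only within a single image. Absolute distances would need one of the metric-depth checkpoints fine-tuned on NYUv2 or KITTI.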

Details

Category: Computer Vision & Object Detection
Price: Free
Platform: Local/Desktop
Difficulty: Easy (2/5)
License: Apache-2.0
Minimum VRAM: 4 GB
Added: Apr 3, 2026

Related Tools

CLIP

DeepFace

Depth Anything V2

Detectron2

DINOv2

ByteTrack