facebookresearch / dinov3
Reference PyTorch implementation and models for DINOv3
☆7,393 Updated this week
Alternatives and similar repositories for dinov3
Users who are interested in dinov3 are comparing it to the libraries listed below
- Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2 ☆2,805 Updated 2 weeks ago
- State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More! ☆1,640 Updated last week
- Official repository for "AM-RADIO: Reduce All Domains Into One" ☆1,347 Updated last week
- DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding ☆1,232 Updated 2 months ago
- PyTorch code and models for VJEPA2 self-supervised learning from video. ☆2,225 Updated last month
- Efficient vision foundation models for high-resolution generation and perception. ☆3,084 Updated 3 weeks ago
- YOLOE: Real-Time Seeing Anything [ICCV 2025] ☆1,784 Updated 3 months ago
- [ICCV 2025] Implementation for Describe Anything: Detailed Localized Image and Video Captioning ☆1,336 Updated 3 months ago
- Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024 ☆1,583 Updated last year
- All-in-one training for vision models (YOLO, ViTs, RT-DETR, DINOv3): pretraining, fine-tuning, distillation. ☆892 Updated this week
- Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything ☆1,342 Updated 4 months ago
- SAM with text prompt ☆2,396 Updated last month
- Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more. ☆3,149 Updated 4 months ago
- [NeurIPS 2025] SpatialLM: Training Large Language Models for Structured Indoor Modeling ☆4,006 Updated this week
- Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series ☆1,036 Updated 8 months ago
- The simplest, fastest repository for training/finetuning small-sized VLMs. ☆4,063 Updated 2 weeks ago
- Depth Pro: Sharp Monocular Metric Depth in Less Than a Second. ☆4,851 Updated 5 months ago
- RF-DETR is a real-time object detection model architecture developed by Roboflow, SOTA on COCO and designed for fine-tuning. ☆3,021 Updated 2 weeks ago
- Code for "MatchAnything: Universal Cross-Modality Image Matching with Large-Scale Pre-Training", arXiv 2025. ☆1,117 Updated 2 months ago
- [CVPR 2025 Best Paper Nomination] FoundationStereo: Zero-Shot Stereo Matching ☆2,118 Updated 3 weeks ago
- [ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity" ☆2,733 Updated 2 months ago
- Efficient Track Anything ☆638 Updated 8 months ago
- [NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation ☆6,555 Updated 8 months ago
- PyTorch code and models for the DINOv2 self-supervised learning method. ☆11,638 Updated last month
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆17,013 Updated 9 months ago
- [CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone ☆1,757 Updated 2 months ago
- A suite of image and video neural tokenizers ☆1,671 Updated 7 months ago
- Tracking Any Point (TAP) ☆1,672 Updated last week
- EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything ☆2,419 Updated 9 months ago
- [ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection" ☆8,969 Updated last year