IDEA-Research/DINO-X-API

DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding

☆1,399

Alternatives and similar repositories for DINO-X-API

Users who are interested in DINO-X-API are comparing it to the repositories listed below.

IDEA-Research / Grounded-SAM-2
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
☆3,647 · Nov 11, 2025 · Updated 8 months ago
IDEA-Research / Grounding-DINO-1.5-API
Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
☆1,139 · Jan 21, 2025 · Updated last year
IDEA-Research / GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
☆10,429 · Aug 12, 2024 · Updated last year
IDEA-Research / Rex-Omni
[CVPR2026] Detect Anything via Next Point Prediction
☆1,514 · Feb 22, 2026 · Updated 5 months ago
THU-MIG / yoloe
YOLOE: Real-Time Seeing Anything [ICCV 2025]
☆2,212 · Jun 26, 2025 · Updated last year
facebookresearch / sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode…
☆19,576 · May 30, 2026 · Updated last month
facebookresearch / dinov3
Reference PyTorch implementation and models for DINOv3
☆10,988 · Jul 15, 2026 · Updated last week
IDEA-Research / T-Rex
[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
☆2,689 · Oct 15, 2025 · Updated 9 months ago
AILab-CVC / YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
☆6,476 · Feb 26, 2025 · Updated last year
UX-Decoder / DINOv
[CVPR 2024] Official implementation of the paper "Visual In-context Learning"
☆542 · Apr 8, 2024 · Updated 2 years ago
IDEA-Research / Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and …
☆17,688 · Sep 5, 2024 · Updated last year
facebookresearch / dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
☆13,146 · Jun 3, 2026 · Updated last month
IDEA-Research / RexSeek
[ICCV2025] Referring any person or objects given a natural language description. Code base for RexSeek and HumanRef Benchmark
☆184 · Oct 15, 2025 · Updated 9 months ago
NVlabs / RADIO
Official repository for "AM-RADIO: Reduce All Domains Into One"
☆1,898 · May 29, 2026 · Updated last month
iSEE-Laboratory / LLMDet
(CVPR 2025 highlight✨) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of La…
☆606 · Feb 4, 2026 · Updated 5 months ago
facebookresearch / perception_models
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
☆2,328 · Apr 13, 2026 · Updated 3 months ago
microsoft / GLIP
Grounded Language-Image Pre-training
☆2,605 · Jan 24, 2024 · Updated 2 years ago
siyuanliii / masa
Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything
☆1,376 · May 1, 2025 · Updated last year
wanghao9610 / OV-DINO
OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion
☆408 · Mar 12, 2025 · Updated last year
yangchris11 / samurai
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
☆7,104 · Mar 18, 2025 · Updated last year
UX-Decoder / Semantic-SAM
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
☆2,853 · Jul 10, 2025 · Updated last year
UX-Decoder / Segment-Everything-Everywhere-All-At-Once
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
☆4,795 · Aug 19, 2024 · Updated last year
facebookresearch / sam3
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading t…
☆11,021 · Jul 15, 2026 · Updated last week
xinyu1205 / recognize-anything
Open-source and strong foundation image recognition models.
☆3,690 · Feb 18, 2025 · Updated last year
baaivision / tokenize-anything
[ECCV 2024] Tokenize Anything via Prompting
☆601 · Dec 11, 2024 · Updated last year
DepthAnything / Depth-Anything-V2
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
☆8,515 · Mar 24, 2026 · Updated 3 months ago
HarborYuan / ovsam
[ECCV 2024] The official code of paper "Open-Vocabulary SAM".
☆1,031 · Aug 4, 2025 · Updated 11 months ago
QwenLM / Qwen3-VL
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
☆19,645 · Jan 30, 2026 · Updated 5 months ago
om-ai-lab / VLM-R1
Solve Visual Understanding with Reinforced VLMs
☆6,013 · Jul 7, 2026 · Updated 2 weeks ago
xuxw98 / ESAM
[ICLR 2025, Oral] EmbodiedSAM: Online Segment Any 3D Thing in Real Time
☆634 · May 7, 2025 · Updated last year
CASIA-LMC-Lab / FastSAM
Fast Segment Anything
☆8,381 · Jul 30, 2024 · Updated last year
IDEA-Research / OpenSeeD
[ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection"
☆762 · Jan 22, 2024 · Updated 2 years ago
LiheYoung / Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
☆8,166 · Jul 17, 2024 · Updated 2 years ago
microsoft / MoGe
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
☆2,659 · Updated this week
yformer / EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
☆2,485 · Dec 24, 2024 · Updated last year
facebookresearch / vggt
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
☆13,950 · May 19, 2026 · Updated 2 months ago
bytedance / Sa2VA
Official Repo For Pixel-LLM Codebase: Sa2VA (Arxiv-25), SAMTok (CVPR-26), VRT, SaSaSa2VA (1-st solution for LSVOS)
☆1,650 · Jun 19, 2026 · Updated last month
Intellindust-AI-Lab / DEIMv2
[DEIMv2] Real Time Object Detection Meets DINOv3
☆1,937 · Mar 24, 2026 · Updated 3 months ago
NVlabs / describe-anything
[ICCV 2025] Implementation for Describe Anything: Detailed Localized Image and Video Captioning
☆1,505 · Jun 26, 2025 · Updated last year