THU-MIG / yoloe
YOLOE: Real-Time Seeing Anything
☆1,183 · Updated this week
Alternatives and similar repositories for yoloe:
Users who are interested in yoloe are comparing it to the repositories listed below:
- YOLOv12: Attention-Centric Real-Time Object Detectors ☆1,623 · Updated 3 weeks ago
- DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding ☆1,023 · Updated 2 weeks ago
- Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series ☆941 · Updated 3 months ago
- Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything ☆1,268 · Updated this week
- Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2 ☆2,066 · Updated 2 weeks ago
- [CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence ☆722 · Updated last month
- D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement [ICLR 2025 Spotlight] ☆1,914 · Updated 3 weeks ago
- RF-DETR is a real-time object detection model architecture developed by Roboflow, SOTA on COCO & designed for fine-tuning. ☆2,015 · Updated last week
- [CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥… ☆3,597 · Updated last week
- Implementation for Describe Anything: Detailed Localized Image and Video Captioning ☆908 · Updated this week
- This is the third party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detectio… ☆589 · Updated 10 months ago
- Efficient Track Anything ☆534 · Updated 4 months ago
- This repository is an official implementation of the paper "LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection". ☆320 · Updated 2 months ago
- Images to inference with no labeling (use foundation models to train supervised models). ☆2,249 · Updated last month
- Python scripts for the Segment Anything 2 (SAM2) model in ONNX ☆243 · Updated 8 months ago
- Run Segment Anything Model 2 on a live video stream ☆377 · Updated 3 months ago
- SAM with text prompt ☆2,138 · Updated 2 weeks ago
- Official implementation of OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion ☆313 · Updated last month
- [ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy ☆2,468 · Updated 2 weeks ago
- [ICCV 2023] DETRs with Collaborative Hybrid Assignments Training ☆1,184 · Updated 4 months ago
- [CVPR 2024] Real-Time Open-Vocabulary Object Detection ☆5,366 · Updated 2 months ago
- Quick exploration into fine tuning florence 2 ☆309 · Updated 7 months ago
- [AAAI 2025] Official PyTorch implementation of "TinySAM: Pushing the Envelope for Efficient Segment Anything Model" ☆468 · Updated 3 months ago
- The repo for "Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator" ☆551 · Updated 2 weeks ago
- Efficient vision foundation models for high-resolution generation and perception. ☆2,829 · Updated last week
- Official implementation of the WACV 2025 (Oral) paper. RT-DETRv3: Real-time End-to-End Object Detection with Hierarchical Dense Positiv… ☆152 · Updated last month
- Includes the code for training and testing the CountGD model from the paper CountGD: Multi-Modal Open-World Counting. ☆213 · Updated last month
- Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model" ☆402 · Updated last month
- Code for "MatchAnything: Universal Cross-Modality Image Matching with Large-Scale Pre-Training", Arxiv 2025. ☆906 · Updated 3 months ago
- 3D object detection using YOLO and depth estimation ☆193 · Updated last month