deepmancer / clip-object-detection
Zero-shot object detection with CLIP, utilizing Faster R-CNN for region proposals.
☆41Updated last year
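The description above pairs a region proposer with CLIP-style image-text matching. A minimal sketch of only the scoring step follows. It assumes each proposed region and each text prompt has already been embedded into a shared vector space; it is not taken from the repository. The toy vectors stand in for real Faster R-CNN crops and CLIP encoders, and the names `classify_regions`, `temperature`, and `min_prob` are hypothetical, chosen for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_regions(region_embs, text_embs, labels, temperature=100.0, min_prob=0.6):
    """Assign each region the best-matching text label, or drop it.

    temperature and min_prob are assumed values for this sketch,
    not settings read from the repository.
    """
    text_unit = [l2_normalize(t) for t in text_embs]
    detections = []
    for idx, emb in enumerate(region_embs):
        region_unit = l2_normalize(emb)
        # Cosine similarity against every prompt, scaled before the softmax.
        logits = [
            temperature * sum(a * b for a, b in zip(region_unit, t))
            for t in text_unit
        ]
        probs = softmax(logits)
        best = max(range(len(labels)), key=probs.__getitem__)
        if probs[best] >= min_prob:
            detections.append((idx, labels[best], probs[best]))
    return detections


# Toy embeddings standing in for encoded region crops and encoded prompts.
regions = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.95], [0.5, 0.5, 0.5]]
texts = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
labels = ["a photo of a cat", "a photo of a dog"]

detections = classify_regions(regions, texts, labels)
for idx, label, prob in detections:
    print(f"region {idx}: {label} ({prob:.3f})")
```

In this toy run the third region is equally similar to both prompts, so it falls below the threshold and is dropped. That is the "zero-shot" part of the idea: the label set is just a list of text prompts and can be changed without retraining.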
Alternatives and similar repositories for clip-object-detection
Users who are interested in clip-object-detection are comparing it to the libraries listed below.
- All-in-one training for vision models (YOLO, ViTs, RT-DETR, DINOv3): pretraining, fine-tuning, distillation.☆1,290Updated this week
- Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything☆1,362Updated 9 months ago
- DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding☆1,334Updated 6 months ago
- A distilled Segment Anything (SAM) model capable of running real-time with NVIDIA TensorRT☆854Updated 2 years ago
- Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series☆1,083Updated last year
- Run Segment Anything Model 2 on a live video stream☆564Updated 8 months ago
- Includes the code for training and testing the CountGD model from the paper CountGD: Multi-Modal Open-World Counting.☆301Updated 7 months ago
- [DEIMv2] Real Time Object Detection Meets DINOv3☆1,463Updated last month
- Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2☆3,255Updated 2 months ago
- Official repository for "AM-RADIO: Reduce All Domains Into One"☆1,505Updated last week
- A curated publication list of resources on open vocabulary semantic segmentation and related areas (e.g. zero-shot semantic segmentation).☆827Updated 3 weeks ago
- This is the third party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detectio…☆791Updated 6 months ago
- ☆61Updated 2 years ago
- YOLOE: Real-Time Seeing Anything [ICCV 2025]☆2,029Updated 7 months ago
- Downstream-Dino-V2: A GitHub repository featuring an easy-to-use implementation of the DINOv2 model by Facebook for downstream tasks such…☆269Updated 2 years ago
- Efficient Track Anything☆775Updated last year
- SAM with text prompt☆2,532Updated 5 months ago
- [ICCV 2023] Tracking Anything with Decoupled Video Segmentation☆1,485Updated 9 months ago
- This repository is a curated collection of the most exciting and influential CVPR 2024 papers. 🔥 [Paper + Code + Demo]☆742Updated 8 months ago
- (CVPR 2025 highlight✨) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of La…☆555Updated this week
- Images to inference with no labeling (use foundation models to train supervised models).☆2,616Updated 8 months ago
- CoRL 2024☆464Updated last year
- A curated list of papers, datasets and resources pertaining to open vocabulary object detection.☆399Updated 8 months ago
- Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024☆1,629Updated last year
- (TPAMI 2024) A Survey on Open Vocabulary Learning☆986Updated last month
- [CVPR 2025 Highlight] Official code and models for Encoder-only Mask Transformer (EoMT).☆526Updated 3 months ago
- 3D object detection using YOLO and depth estimation☆390Updated 10 months ago
- This is a repository for listing papers on scene graph generation and application.☆579Updated 3 weeks ago
- Testing adaptation of the DINOv2/3 encoders for vision tasks with Low-Rank Adaptation (LoRA)☆431Updated 3 months ago
- Official Pytorch Implementation for “DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video” (ECCV 2024)☆549Updated last year