om-ai-lab / VLM-FO1
VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs
☆148Updated 2 weeks ago
Alternatives and similar repositories for VLM-FO1
Users who are interested in VLM-FO1 are comparing it to the repositories listed below.
- Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection☆93Updated 9 months ago
- ☆52Updated 5 months ago
- YOLO-UniOW: Efficient Universal Open-World Object Detection☆170Updated 10 months ago
- Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning☆130Updated 5 months ago
- Official code for "No time to train! Training-Free Reference-Based Instance Segmentation"☆262Updated 2 weeks ago
- A Light-Weight Framework for Open-Set Object Detection with Decoupled Feature Alignment in Joint Space☆95Updated last week
- Includes the VideoCount dataset and CountVid code for the paper Open-World Object Counting in Videos.☆78Updated last month
- Official implementation of RT-DETRv4: Painlessly Furthering Real-Time Object Detection with Vision Foundation Models☆149Updated 2 weeks ago
- Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed☆107Updated last year
- 🔮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)☆201Updated last month
- [NeurIPS-W 2025] Official Implementation of "Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning"☆54Updated 5 months ago
- [WACV 2026] Official implementation of the paper: “CountingDINO: A Training-free Pipeline for Exemplar-based Class-Agnostic Counting”☆44Updated last month
- [NeurIPS 2024 🔥] DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model☆43Updated 11 months ago
- The source code of IEEE TPAMI 2025 "Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation".☆115Updated 11 months ago
- [ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding☆64Updated last year
- [ICML2025] Official Implementation of CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering☆76Updated last month
- [ECCV2024] Official implementation of Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes☆92Updated 7 months ago
- X-SAM: From Segment Anything to Any Segmentation (AAAI2026)☆330Updated 2 weeks ago
- (CVPR 2025 highlight✨) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of La…☆511Updated 3 months ago
- Use DINOv3’s powerful, self-supervised visual features + YOLOv12’s blazing-fast detection, all in one repo. Whether you have only a few h…☆161Updated 2 weeks ago
- We propose IAD-R1, a universal post-training framework that enhances Vision-Language Models for industrial anomaly detection through a tw…☆46Updated this week
- [CVPR'25] Official repo of "Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances"☆39Updated 4 months ago
- Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)☆981Updated 2 weeks ago
- OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion☆382Updated 9 months ago
- New generation of CLIP with fine-grained discrimination capability, ICML 2025☆497Updated last month
- Demo for Qwen2.5-VL-3B-Instruct on Axera device.☆17Updated 3 months ago
- The source code of IEEE TPAMI 2025 "Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation".☆203Updated 11 months ago
- The official implementation for [ACMMM25] Dome-DETR: DETR with Density-Oriented Feature-Query Manipulation for Efficient Tiny Object Dete…☆62Updated 3 months ago
- Implementation of paper - DEYO: DETR with YOLO for End-to-End Object Detection☆98Updated last year
- Make Large Multimodal Models excel in object detection, ICCV 2025☆61Updated 4 months ago