om-ai-lab / VLM-FO1
VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs
☆78Updated this week
Alternatives and similar repositories for VLM-FO1
Users who are interested in VLM-FO1 are comparing it to the repositories listed below.
- YOLO-UniOW: Efficient Universal Open-World Object Detection☆166Updated 10 months ago
- Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection☆93Updated 8 months ago
- ☆51Updated 4 months ago
- Official code for "No time to train! Training-Free Reference-Based Instance Segmentation"☆254Updated this week
- Make Large Multimodal Models excel in object detection, ICCV 2025☆53Updated 3 months ago
- [ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding☆64Updated last year
- A Light-Weight Framework for Open-Set Object Detection with Decoupled Feature Alignment in Joint Space☆91Updated 10 months ago
- [ECCV2024] Official implementation of Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes☆92Updated 6 months ago
- [NeurIPS 2024 Spotlight ⭐️ & TPAMI 2025] Parameter-Inverted Image Pyramid Networks (PIIP)☆105Updated 3 months ago
- 🔮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)☆190Updated last month
- [NeurIPS2025 Workshop] Official Implementation of "Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning"☆52Updated 4 months ago
- ☆22Updated 10 months ago
- [ECCV2024 Oral] Official implementation of the paper "Relation DETR: Exploring Explicit Position Relation Prior for Object Detection"☆245Updated 11 months ago
- [ECCV 2024] Official implementation of "LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction"☆86Updated 7 months ago
- Vision Manus: Your versatile Visual AI assistant☆297Updated last month
- [ICCV 2025] Official implementation of the paper: "Dynamic-DINO: Fine-Grained Mixture of Experts Tuning for Real-time Open-Vocabulary Obj…☆70Updated 3 months ago
- [CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception☆142Updated 5 months ago
- X-SAM: From Segment Anything to Any Segmentation (AAAI2026)☆321Updated 2 weeks ago
- OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion☆378Updated 8 months ago
- [CVPR2025] Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".☆176Updated 11 months ago
- New generation of CLIP with fine grained discrimination capability, ICML2025☆472Updated 3 weeks ago
- ☆95Updated 3 months ago
- Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning☆128Updated 4 months ago
- (CVPR 2025 highlight✨) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of La…☆491Updated 3 months ago
- [NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…☆248Updated 2 weeks ago
- [CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution☆55Updated 8 months ago
- The official implementation of [CVPR 2025] "5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks".☆382Updated 5 months ago
- [ECCV 2024] The official PyTorch implementation of the "Plain-Det: A Plain Multi-Dataset Object Detector".☆29Updated 11 months ago
- InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition (NeurIPS 2025)☆101Updated last month
- ☆62Updated 3 weeks ago