om-ai-lab / VLM-FO1
VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs
☆237Nov 28, 2025Updated 2 months ago
Alternatives and similar repositories for VLM-FO1
Users who are interested in VLM-FO1 are comparing it to the repositories listed below.
- [NeurIPS 2024] OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling.☆30Nov 13, 2025Updated 3 months ago
- Fine-Grained Pixel-Text Alignment for Open-Vocabulary Semantic Segmentation☆15Sep 24, 2025Updated 4 months ago
- #ICCV, #MoE, #Tracking☆33Jul 11, 2025Updated 7 months ago
- [ECCV 2024] SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation☆49Mar 20, 2025Updated 10 months ago
- Official repo of the Griffon series, including v1 (ECCV 2024), v2 (ICCV 2025), G, and R, as well as the RL tool Vision-R1.☆249Aug 12, 2025Updated 6 months ago
- RefDrone: A Challenging Benchmark for Drone Scene Referring Expression Comprehension☆32Dec 23, 2025Updated last month
- Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)☆1,128Jan 25, 2026Updated 3 weeks ago
- yolov5: pytorch->onnx->caffe->hisi3559☆23Jun 5, 2024Updated last year
- ☆30Jan 18, 2026Updated 3 weeks ago
- A summary of publicly available fire and smoke datasets☆35Mar 29, 2025Updated 10 months ago
- Official repository of paper "WeDetect: Fast Open-Vocabulary Object Detection as Retrieval"☆116Feb 7, 2026Updated last week
- [CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution☆58Mar 4, 2025Updated 11 months ago
- Related paper and code list☆28Mar 1, 2021Updated 4 years ago
- [ECCV 2024] The official PyTorch implementation of the "Plain-Det: A Plain Multi-Dataset Object Detector".☆30Dec 8, 2024Updated last year
- [NeurIPS 2024] VastTrack: Vast Category Visual Object Tracking☆73Sep 30, 2025Updated 4 months ago
- The official implementation of our work Hawkeye: Discovering and Grounding Implicit Anomalous Sentiment in Recon-videos via Scene-enhanc…☆12Oct 14, 2024Updated last year
- 3D Gaussian Splatting for underwater scene reconstruction via physical-based appearance-medium decoupling☆23Updated this week
- ☆29Apr 23, 2020Updated 5 years ago
- Official Repo for PosSAM: Panoptic Open-vocabulary Segment Anything☆70Apr 7, 2024Updated last year
- ☆82Jan 18, 2026Updated 3 weeks ago
- ☆107Aug 14, 2025Updated 6 months ago
- Code for "Lightweight Infrared Small Target Detection Network Using Full-Scale Skip Connection U-Net" in IEEE GRSL 2023☆15Sep 7, 2023Updated 2 years ago
- (NeurIPS 2024) Official repository of paper "Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models"☆35Mar 22, 2025Updated 10 months ago
- DisTime: Distribution-based Time Representation for Video Large Language Models.☆18Jul 10, 2025Updated 7 months ago
- [Arxiv'24] LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding☆40Aug 18, 2025Updated 5 months ago
- Finetuning & extending DiffusionDet to video & pedestrian multi-object-tracking☆13Apr 12, 2023Updated 2 years ago
- [ICCV2025] Harnessing CLIP, DINO and SAM for Open Vocabulary Segmentation☆106Nov 22, 2025Updated 2 months ago
- ☆41Dec 10, 2024Updated last year
- [CVPR 2023] Implementation of Siamese Image Modeling for Self-Supervised Vision Representation Learning☆41Jun 6, 2024Updated last year
- Using OnnxRuntime to run inference for yolov10, yolov10+SAM, yolov10+bytetrack, SAM2, and paddleOCR in C++.☆161Sep 25, 2025Updated 4 months ago
- 🚀🚀🚀Official code for the paper "YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection." *(YO…☆346Updated this week
- YOLO-UniOW: Efficient Universal Open-World Object Detection☆176Jan 17, 2025Updated last year
- Deploying a stripped-down version of yolov5 on HiSilicon devices☆13Nov 22, 2021Updated 4 years ago
- ☆13Feb 23, 2023Updated 2 years ago
- Deploy Yolo series algorithms on Hisilicon platform hi3516, including yolov3, yolov5, yolox, etc☆11Mar 25, 2022Updated 3 years ago
- ☆10Oct 23, 2017Updated 8 years ago
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation☆13Jun 17, 2024Updated last year
- ☆10Nov 15, 2023Updated 2 years ago
- Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning☆141Jun 30, 2025Updated 7 months ago