IDEA-Research / RexSeek
[ICCV2025] Referring to any person or object given a natural language description. Codebase for RexSeek and the HumanRef benchmark.
☆138 · Updated 3 months ago
Alternatives and similar repositories for RexSeek
Users who are interested in RexSeek are comparing it to the repositories listed below.
- Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding ☆194 · Updated 5 months ago
- [CVPR25] Official repository for the paper: "SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation" ☆288 · Updated 2 weeks ago
- CAVIS: Context-Aware Video Instance Segmentation ☆86 · Updated 2 weeks ago
- Scaling Vision Pre-Training to 4K Resolution ☆190 · Updated last month
- ☆186 · Updated last month
- Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning ☆88 · Updated 2 weeks ago
- Includes the code for training and testing the CountGD model from the paper "CountGD: Multi-Modal Open-World Counting". ☆257 · Updated 2 weeks ago
- [ICLR 2025 oral] RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything ☆253 · Updated 3 months ago
- [ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model" ☆243 · Updated 6 months ago
- ☆179 · Updated 9 months ago
- ☆120 · Updated last year
- Official Code for Tracking Any Object Amodally ☆118 · Updated last year
- (NeurIPS2023) CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection ☆117 · Updated last year
- InstaGen: Enhancing Object Detection by Training on Synthetic Dataset, CVPR2024 ☆81 · Updated last year
- Includes the VideoCount dataset and CountVid code for the paper "Open-World Object Counting in Videos". ☆49 · Updated last week
- Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model" ☆429 · Updated 3 months ago
- (ICCV 2025) ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations ☆53 · Updated last week
- [ICCV2023] Segment Every Reference Object in Spatial and Temporal Spaces ☆239 · Updated 5 months ago
- Use Segment Anything 2, grounded with Florence-2, to auto-label data for use in training vision models. ☆125 · Updated 11 months ago
- [CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloadin… ☆226 · Updated 9 months ago
- [CVPR'24 Highlight] PyTorch Implementation of Object Recognition as Next Token Prediction ☆180 · Updated 2 months ago
- Odd-One-Out: Anomaly Detection by Comparing with Neighbors (CVPR25) ☆43 · Updated 7 months ago
- Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model". ☆156 · Updated 7 months ago
- Recognize Any Regions ☆122 · Updated 6 months ago
- The official implementation of "VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning" ☆222 · Updated last month
- AutoTrackAnything is a universal, flexible and interactive tool for insane automatic object tracking over thousands of frames. It is deve… ☆84 · Updated last year
- [CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection ☆177 · Updated 3 months ago
- 🏄 [ICLR 2025] OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer ☆66 · Updated 2 weeks ago
- [CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception ☆65 · Updated last month
- PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models" ☆199 · Updated 6 months ago