iSEE-Laboratory / ReferDINO
The official implementation of the paper "ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations".
☆35 Updated 2 months ago
Alternatives and similar repositories for ReferDINO:
Users who are interested in ReferDINO are comparing it to the repositories listed below.
- [ECCV 2024] Decomposition Betters Tracking Everything Everywhere ☆113 Updated 8 months ago
- [CVPR25] Official repository for the paper: "SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation" ☆123 Updated last week
- CAVIS: Context-Aware Video Instance Segmentation ☆84 Updated 3 months ago
- Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model ☆49 Updated 2 months ago
- Implementation of Zero-Shot Video Semantic Segmentation [CVPR 2025] ☆44 Updated last month
- [NeurIPS2023] 3D-OWIS is capable of detecting unknown instances in inference, and progressively learning novel classes in the process of … ☆67 Updated last year
- Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation ☆55 Updated last week
- Official pytorch implementation of "XHand: Real-time Expressive Hand Avatar" ☆75 Updated 8 months ago
- ☆33 Updated 3 months ago
- Official repository for "Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation" (ICLR2025) ☆68 Updated last week
- Official Implementation of AuraFusion360 ☆60 Updated 3 weeks ago
- Official implementation of "Local All-Pair Correspondence for Point Tracking" (ECCV 2024) ☆163 Updated 4 months ago
- 4D Panoptic Scene Graph Generation (NeurIPS'23 Spotlight) ☆105 Updated 2 weeks ago
- Official Code for "MITracker: Multi-View Integration for Visual Object Tracking" ☆49 Updated last week
- Official implementation of "Exploring Temporally-Aware Features for Point Tracking" (CVPR 2025) ☆65 Updated 3 weeks ago
- Scaling Vision Pre-Training to 4K Resolution ☆93 Updated last week
- [CVPR 2025] Official code for Using Diffusion Priors for Video Amodal Segmentation ☆64 Updated last week
- Official implementation of “4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models” (CVPR 2025) ☆76 Updated 2 weeks ago
- The official implementation of "CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities". (arXiv 2501.08983) ☆87 Updated 2 months ago
- ☆33 Updated last week
- 🏄 [ICLR 2025] OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer ☆42 Updated last week
- Unifying 2D and 3D Vision-Language Understanding ☆49 Updated last week
- Official PyTorch implementation of Self-Supervised Any-Point Tracking by Contrastive Random Walks, ECCV 2024. ☆50 Updated 4 months ago
- Referring any person or objects given a natural language description. Code base for RexSeek and HumanRef Benchmark ☆89 Updated last week
- ☆38 Updated last month
- ☆25 Updated 2 months ago
- [WACV 2025] Efficient Video Object Segmentation via Modulated Cross-Attention Memory ☆54 Updated last month
- ☆22 Updated last week
- ☆88 Updated 2 months ago
- This is the project page of ShowRoom3D ☆25 Updated last year