🔮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)
☆231 · Jan 4, 2026 · Updated 2 months ago
Alternatives and similar repositories for UniPixel
Users who are interested in UniPixel are comparing it to the repositories listed below.
- [ICCV 2025] Official implementation of the paper: "Dynamic-DINO: Fine-Grained Mixture of Experts Tuning for Real-time Open-Vocabulary Obj… ☆74 · Jul 29, 2025 · Updated 7 months ago
- VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning ☆326 · Feb 9, 2026 · Updated last month
- ☆11 · Dec 6, 2024 · Updated last year
- Paper List on Earth Observation in the Foundation Model Era ☆30 · Mar 15, 2026 · Updated last week
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding ☆83 · Jul 4, 2025 · Updated 8 months ago
- ☆30 · Sep 11, 2025 · Updated 6 months ago
- 🧠 VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning (ICLR 2026) ☆311 · Feb 8, 2026 · Updated last month
- Official pytorch implementation for SingleInsert ☆28 · Apr 19, 2024 · Updated last year
- ☆23 · Jul 15, 2024 · Updated last year
- ☆11 · Dec 23, 2024 · Updated last year
- (CVPR 2026) Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation ☆30 · Feb 28, 2026 · Updated 3 weeks ago
- InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition (NeurIPS 2025) ☆108 · Feb 28, 2026 · Updated 3 weeks ago
- [ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models" ☆53 · Feb 10, 2025 · Updated last year
- Visual Grounding with Multi-modal Conditional Adaptation (ACMMM 2024 Oral) ☆26 · Jun 11, 2025 · Updated 9 months ago
- Official implementation for P2SAM (ACM MM 2024) ☆14 · Dec 7, 2024 · Updated last year
- [NeurIPS 2025] "DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling" ☆97 · Dec 21, 2025 · Updated 3 months ago
- [NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding ☆82 · Dec 14, 2025 · Updated 3 months ago
- [AAAI 2025] Explore In-Context Segmentation via Latent Diffusion Models ☆22 · Mar 25, 2025 · Updated 11 months ago
- [CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models ☆163 · Sep 12, 2024 · Updated last year
- Official Repo For Pixel-LLM Codebase ☆1,560 · Feb 27, 2026 · Updated 3 weeks ago
- ☆37 · Jan 14, 2025 · Updated last year
- ☆19 · Aug 3, 2024 · Updated last year
- Open-Vocabulary SAM3D: Understand Any 3D Scene ☆40 · Jun 9, 2025 · Updated 9 months ago
- 🚀 Official code for “XStreamVGGT: Extremely Memory-Efficient Streaming Vision Geometry Grounded Transformer with KV Cache Compression”, … ☆40 · Jan 27, 2026 · Updated last month
- Masked Autoencoders for Unsupervised Anomaly Detection in Medical Images ☆21 · Aug 15, 2023 · Updated 2 years ago
- ☆15 · Nov 1, 2024 · Updated last year
- OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams ☆47 · Updated this week
- Official Implementation of Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training ☆143 · Mar 13, 2026 · Updated last week
- Official implementation of "Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation" (ICCV 2… ☆80 · Aug 5, 2025 · Updated 7 months ago
- A curated list of papers & resources linked to concept learning ☆12 · Aug 9, 2023 · Updated 2 years ago
- Unified Change Detection Framework ☆44 · May 24, 2025 · Updated 9 months ago
- [AAAI 2026] GenMAC for Compositional Text-to-Video Generation ☆32 · Jan 10, 2026 · Updated 2 months ago
- [2022 TPAMI] Contrastive Positive Sample Propagation along the Audio-Visual Event Line ☆32 · Mar 6, 2023 · Updated 3 years ago
- [SIGGRAPH Asia 2025] Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization ☆35 · Nov 30, 2025 · Updated 3 months ago
- [ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model" ☆270 · Dec 30, 2024 · Updated last year
- [CVPR 2026] Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens ☆254 · Aug 2, 2025 · Updated 7 months ago
- The official GitHub page for the survey paper "A Survey on LLM Symbolic Reasoning". And this paper is under review. ☆26 · Feb 15, 2026 · Updated last month
- [ICCV'25] ScenePainter: Semantically Consistent Perpetual 3D Scene Generation with Concept Relation Alignment ☆36 · Oct 5, 2025 · Updated 5 months ago
- Implementation of YOLO and IOU tracker in C++ ☆18 · Dec 20, 2021 · Updated 4 years ago