UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)
★241 · Jan 4, 2026 · Updated 5 months ago
Alternatives and similar repositories for UniPixel
Users that are interested in UniPixel are comparing it to the libraries listed below.
- [IEEE TMI'23] IOP-FL: Inside-Outside Personalization for Federated Medical Image Segmentation · ★13 · Apr 2, 2023 · Updated 3 years ago
- OpenSeg-R: Improving Open-Vocabulary Segmentation via Step-by-Step Visual Reasoning · ★29 · May 24, 2025 · Updated last year
- [ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning · ★347 · Feb 9, 2026 · Updated 4 months ago
- ★12 · Dec 6, 2024 · Updated last year
- Paper List on Earth Observation in the Foundation Model Era · ★31 · Apr 12, 2026 · Updated last month
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding · ★83 · Jul 4, 2025 · Updated 11 months ago
- [ECCV 2024] FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance · ★18 · Sep 11, 2024 · Updated last year
- VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning (ICLR 2026) · ★340 · Feb 8, 2026 · Updated 4 months ago
- Official PyTorch implementation for SingleInsert · ★29 · Apr 19, 2024 · Updated 2 years ago
- ★31 · Mar 24, 2026 · Updated 2 months ago
- ★22 · Jul 15, 2024 · Updated last year
- (CVPR 2026) Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation · ★36 · Feb 28, 2026 · Updated 3 months ago
- ★14 · Oct 30, 2023 · Updated 2 years ago
- Vision and Language Reference Prompt into SAM for Few-shot Segmentation · ★32 · Apr 8, 2025 · Updated last year
- Differentiable Hierarchical Visual Tokenization · ★45 · Nov 26, 2025 · Updated 6 months ago
- Visual Grounding with Multi-modal Conditional Adaptation (ACMMM 2024 Oral) · ★26 · Jun 11, 2025 · Updated 11 months ago
- ★12 · Dec 13, 2024 · Updated last year
- Official Repo of "Flow-OPD: On-Policy Distillation for Flow Matching Models" · ★226 · Updated this week
- ★11 · Dec 23, 2024 · Updated last year
- [CVPR 2026] Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens · ★282 · Aug 2, 2025 · Updated 10 months ago
- Toy-scale unified multimodal model experiments: encoder-free understanding & generation with Mixture-of-Transformers on MLX/Apple Silico… · ★43 · Mar 8, 2026 · Updated 3 months ago
- (CVPR 2026) Official repository of paper "WeDetect: Fast Open-Vocabulary Object Detection as Retrieval" · ★213 · Updated this week
- This is a project on visual spatial reasoning tasks-SIBench · ★26 · Jan 12, 2026 · Updated 4 months ago
- [AAAI 2025] Explore In-Context Segmentation via Latent Diffusion Models · ★22 · Mar 25, 2025 · Updated last year
- This repo aims to include materials (papers, codes, slides) about SAM2 (segment anything in images and videos). We are continuously impro… · ★150 · Oct 1, 2025 · Updated 8 months ago
- Official implementation for TAO (CVPR 2025) · ★20 · Jan 1, 2026 · Updated 5 months ago
- [CVPR 2024] GSVA: Generalized Segmentation via Multimodal Large Language Models · ★166 · Sep 12, 2024 · Updated last year
- Official Repo For Pixel-LLM Codebase: Sa2VA (Arxiv-25), SAMTok (CVPR-26), VRT, SaSaSa2VA (1-st solution for LSVOS) · ★1,609 · May 21, 2026 · Updated 3 weeks ago
- SMART-ELE: Smart Digital Twin Substation · ★31 · Aug 10, 2025 · Updated 10 months ago
- [NeurIPS 2025] "DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling" · ★100 · Dec 21, 2025 · Updated 5 months ago
- [NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding · ★94 · Dec 14, 2025 · Updated 5 months ago
- [NeurIPS 2025 Spotlight] Official implementation of "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu… · ★273 · Nov 5, 2025 · Updated 7 months ago
- ★42 · Jan 14, 2025 · Updated last year
- [ICRA 2025] A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping · ★12 · Feb 7, 2025 · Updated last year
- Open-Vocabulary SAM3D: Understand Any 3D Scene · ★44 · Jun 9, 2025 · Updated last year
- Official implementation of Add-SD: Rational Generation without Manual Reference. · ★28 · Aug 19, 2024 · Updated last year
- [AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding · ★128 · Dec 10, 2024 · Updated last year
- A curated list of papers & resources linked to concept learning · ★13 · Aug 9, 2023 · Updated 2 years ago
- [CVPR 2026] Accelerating Streaming Video Large Language Models via Hierarchical Token Compression · ★67 · Updated this week