🔮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)
☆245Jan 4, 2026Updated 5 months ago
Alternatives and similar repositories for UniPixel
Users who are interested in UniPixel are comparing it to the repositories listed below.
- [IEEE TMI'23] IOP-FL: Inside-Outside Personalization for Federated Medical Image Segmentation☆13Apr 2, 2023Updated 3 years ago
- OpenSeg-R: Improving Open-Vocabulary Segmentation via Step-by-Step Visual Reasoning☆29May 24, 2025Updated last year
- [ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning☆348Feb 9, 2026Updated 4 months ago
- Code for paper: Reinforced Vision Perception with Tools☆74Oct 3, 2025Updated 8 months ago
- ☆12Dec 6, 2024Updated last year
- Paper List on Earth Observation in the Foundation Model Era☆31Jun 15, 2026Updated 2 weeks ago
- 3D BBox refinement interface used in LabelAny3D (NeurIPS 2025)☆22Jan 6, 2026Updated 5 months ago
- ☆78Apr 9, 2026Updated 2 months ago
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding☆84Jul 4, 2025Updated 11 months ago
- [ECCV2024]FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance☆18Sep 11, 2024Updated last year
- 🧠 VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning (ICLR 2026)☆343Feb 8, 2026Updated 4 months ago
- ☆31Mar 24, 2026Updated 3 months ago
- ☆22Jul 15, 2024Updated last year
- InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition (NeurIPS 2025)☆112Feb 28, 2026Updated 4 months ago
- ☆14Oct 30, 2023Updated 2 years ago
- [ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"☆56Feb 10, 2025Updated last year
- Differentiable Hierarchical Visual Tokenization☆45Nov 26, 2025Updated 7 months ago
- Visual Grounding with Multi-modal Conditional Adaptation (ACMMM 2024 Oral)☆26Jun 11, 2025Updated last year
- ☆12Dec 13, 2024Updated last year
- Official Repo of "Flow-OPD: On-Policy Distillation for Flow Matching Models"☆248Updated this week
- Official implementation for P2SAM (ACM MM 2024)☆14Dec 7, 2024Updated last year
- [CVPR 2026] Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens☆286Aug 2, 2025Updated 10 months ago
- [AAAI 2025] Explore In-Context Segmentation via Latent Diffusion Models☆22Mar 25, 2025Updated last year
- [CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models☆166Sep 12, 2024Updated last year
- (CVPR 2026) Official repository of paper "WeDetect: Fast Open-Vocabulary Object Detection as Retrieval"☆230Jun 7, 2026Updated 3 weeks ago
- Official Repo For Pixel-LLM Codebase: Sa2VA (Arxiv-25), SAMTok (CVPR-26), VRT, SaSaSa2VA (1-st solution for LSVOS)☆1,617Jun 19, 2026Updated last week
- [NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding☆96Dec 14, 2025Updated 6 months ago
- [NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…☆277Nov 5, 2025Updated 7 months ago
- Open-Vocabulary SAM3D: Understand Any 3D Scene☆44Jun 9, 2025Updated last year
- Masked Autoencoders for Unsupervised Anomaly Detection in Medical Images☆21Aug 15, 2023Updated 2 years ago
- ☆15Nov 1, 2024Updated last year
- Official implementation of Add-SD: Rational Generation without Manual Reference.☆28Aug 19, 2024Updated last year
- [AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding☆129Dec 10, 2024Updated last year
- [CVPR26] Official code for GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristic☆91Mar 24, 2026Updated 3 months ago
- A curated list of papers & resources linked to concept learning☆13Aug 9, 2023Updated 2 years ago
- Understand what physics/algorithms do transformers learn internally when trained on planetary motion☆43Feb 9, 2026Updated 4 months ago
- 🚀 Official code for “XStreamVGGT: Extremely Memory-Efficient Streaming Vision Geometry Grounded Transformer with KV Cache Compression”, …☆46Jan 27, 2026Updated 5 months ago
- [ICLR 2024] Official implementation for the paper "Continuous Field Reconstruction from Sparse Observations with Implicit Neural Networks…☆19Aug 29, 2025Updated 10 months ago
- This repo takes the initial step towards leveraging text learning for online action detection without explicit human supervision.☆15Dec 13, 2024Updated last year