🔮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)
☆241 · Jan 4, 2026 · Updated 3 months ago
Alternatives and similar repositories for UniPixel
Users who are interested in UniPixel are comparing it to the libraries listed below.
- [ICCV 2025] Official implementation of the paper: "Dynamic-DINO: Fine-Grained Mixture of Experts Tuning for Real-time Open-Vocabulary Obj… ☆77 · Jul 29, 2025 · Updated 9 months ago
- [IEEE TMI'23] IOP-FL: Inside-Outside Personalization for Federated Medical Image Segmentation ☆12 · Apr 2, 2023 · Updated 3 years ago
- OpenSeg-R: Improving Open-Vocabulary Segmentation via Step-by-Step Visual Reasoning ☆29 · May 24, 2025 · Updated 11 months ago
- VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning ☆336 · Feb 9, 2026 · Updated 2 months ago
- Code for paper: Reinforced Vision Perception with Tools ☆72 · Oct 3, 2025 · Updated 6 months ago
- ☆12 · Dec 6, 2024 · Updated last year
- Paper List on Earth Observation in the Foundation Model Era ☆31 · Apr 12, 2026 · Updated 2 weeks ago
- ☆77 · Apr 9, 2026 · Updated 3 weeks ago
- ☆31 · Mar 24, 2026 · Updated last month
- [ECCV2024] FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance ☆17 · Sep 11, 2024 · Updated last year
- 🧠 VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning (ICLR 2026) ☆326 · Feb 8, 2026 · Updated 2 months ago
- Official pytorch implementation for SingleInsert ☆28 · Apr 19, 2024 · Updated 2 years ago
- ☆22 · Jul 15, 2024 · Updated last year
- (CVPR 2026) Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation ☆35 · Feb 28, 2026 · Updated 2 months ago
- InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition (NeurIPS 2025) ☆110 · Feb 28, 2026 · Updated 2 months ago
- Vision and Language Reference Prompt into SAM for Few-shot Segmentation ☆30 · Apr 8, 2025 · Updated last year
- [ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models" ☆55 · Feb 10, 2025 · Updated last year
- Visual Grounding with Multi-modal Conditional Adaptation (ACMMM 2024 Oral) ☆26 · Jun 11, 2025 · Updated 10 months ago
- Official implementation for P2SAM (ACM MM 2024) ☆14 · Dec 7, 2024 · Updated last year
- (CVPR 2026) Official repository of paper "WeDetect: Fast Open-Vocabulary Object Detection as Retrieval" ☆189 · Apr 14, 2026 · Updated 2 weeks ago
- This is a project on visual spatial reasoning tasks-SIBench ☆26 · Jan 12, 2026 · Updated 3 months ago
- [AAAI 2025] Explore In-Context Segmentation via Latent Diffusion Models ☆22 · Mar 25, 2025 · Updated last year
- [CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models ☆166 · Sep 12, 2024 · Updated last year
- Platform + GUI for hyperparameter optimization of recurrent neural networks (MATLAB). ☆10 · Dec 29, 2021 · Updated 4 years ago
- ☆19 · Aug 3, 2024 · Updated last year
- [NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding ☆91 · Dec 14, 2025 · Updated 4 months ago
- Open-Vocabulary SAM3D: Understand Any 3D Scene ☆41 · Jun 9, 2025 · Updated 10 months ago
- ☆39 · Jan 14, 2025 · Updated last year
- [NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu… ☆273 · Nov 5, 2025 · Updated 5 months ago
- Masked Autoencoders for Unsupervised Anomaly Detection in Medical Images ☆21 · Aug 15, 2023 · Updated 2 years ago
- ☆15 · Nov 1, 2024 · Updated last year
- [AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding ☆127 · Dec 10, 2024 · Updated last year
- [CVPR26] Official code for GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristic ☆83 · Mar 24, 2026 · Updated last month
- A curated list of papers & resources linked to concept learning ☆13 · Aug 9, 2023 · Updated 2 years ago
- ☆143 · Feb 13, 2026 · Updated 2 months ago
- Unified Change Detection Framework ☆43 · May 24, 2025 · Updated 11 months ago
- 🚀 Official code for “XStreamVGGT: Extremely Memory-Efficient Streaming Vision Geometry Grounded Transformer with KV Cache Compression”, … ☆45 · Jan 27, 2026 · Updated 3 months ago
- [2022 TPAMI] Contrastive Positive Sample Propagation along the Audio-Visual Event Line ☆32 · Mar 6, 2023 · Updated 3 years ago
- [ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model" ☆270 · Dec 30, 2024 · Updated last year