UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)
☆242 · Jan 4, 2026 · Updated 4 months ago
Alternatives and similar repositories for UniPixel
Users who are interested in UniPixel are comparing it to the repositories listed below.
- [ICCV 2025] Official implementation of the paper: "Dynamic-DINO: Fine-Grained Mixture of Experts Tuning for Real-time Open-Vocabulary Obj… ☆78 · Jul 29, 2025 · Updated 9 months ago
- [IEEE TMI'23] IOP-FL: Inside-Outside Personalization for Federated Medical Image Segmentation ☆13 · Apr 2, 2023 · Updated 3 years ago
- OpenSeg-R: Improving Open-Vocabulary Segmentation via Step-by-Step Visual Reasoning ☆29 · May 24, 2025 · Updated 11 months ago
- [ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning ☆339 · Feb 9, 2026 · Updated 3 months ago
- Code for paper: Reinforced Vision Perception with Tools ☆72 · Oct 3, 2025 · Updated 7 months ago
- ☆12 · Dec 6, 2024 · Updated last year
- Paper List on Earth Observation in the Foundation Model Era ☆31 · Apr 12, 2026 · Updated last month
- ☆77 · Apr 9, 2026 · Updated last month
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding ☆84 · Jul 4, 2025 · Updated 10 months ago
- [ECCV 2024] FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance ☆18 · Sep 11, 2024 · Updated last year
- Official PyTorch implementation for SingleInsert ☆29 · Apr 19, 2024 · Updated 2 years ago
- ☆31 · Mar 24, 2026 · Updated last month
- ☆22 · Jul 15, 2024 · Updated last year
- (CVPR 2026) Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation ☆36 · Feb 28, 2026 · Updated 2 months ago
- InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition (NeurIPS 2025) ☆108 · Feb 28, 2026 · Updated 2 months ago
- ☆14 · Oct 30, 2023 · Updated 2 years ago
- Vision and Language Reference Prompt into SAM for Few-shot Segmentation ☆32 · Apr 8, 2025 · Updated last year
- Visual Grounding with Multi-modal Conditional Adaptation (ACMMM 2024 Oral) ☆26 · Jun 11, 2025 · Updated 11 months ago
- ☆11 · Dec 23, 2024 · Updated last year
- [CVPR 2026] Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens ☆274 · Aug 2, 2025 · Updated 9 months ago
- (CVPR 2026) Official repository of paper "WeDetect: Fast Open-Vocabulary Object Detection as Retrieval" ☆193 · Apr 14, 2026 · Updated last month
- This is a project on visual spatial reasoning tasks-SIBench ☆26 · Jan 12, 2026 · Updated 4 months ago
- [AAAI 2025] Explore In-Context Segmentation via Latent Diffusion Models ☆22 · Mar 25, 2025 · Updated last year
- This repo aims to include materials (papers, codes, slides) about SAM2 (segment anything in images and videos). We are continuously impro… ☆148 · Oct 1, 2025 · Updated 7 months ago
- [CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models ☆166 · Sep 12, 2024 · Updated last year
- Official Repo For Pixel-LLM Codebase: Sa2VA (Arxiv-25), SAMTok (CVPR-26), VRT, SaSaSa2VA (1-st solution for LSVOS) ☆1,600 · May 7, 2026 · Updated 2 weeks ago
- ☆19 · Aug 3, 2024 · Updated last year
- Open-Vocabulary SAM3D: Understand Any 3D Scene ☆41 · Jun 9, 2025 · Updated 11 months ago
- [NeurIPS 2025] "DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling" ☆100 · Dec 21, 2025 · Updated 5 months ago
- [NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding ☆93 · Dec 14, 2025 · Updated 5 months ago
- [NeurIPS2025 Spotlight 🔥] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu… ☆273 · Nov 5, 2025 · Updated 6 months ago
- ☆41 · Jan 14, 2025 · Updated last year
- [ICRA 2025] A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping ☆12 · Feb 7, 2025 · Updated last year
- Masked Autoencoders for Unsupervised Anomaly Detection in Medical Images ☆21 · Aug 15, 2023 · Updated 2 years ago
- ☆15 · Nov 1, 2024 · Updated last year
- Official implementation of Add-SD: Rational Generation without Manual Reference. ☆28 · Aug 19, 2024 · Updated last year
- [AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding ☆128 · Dec 10, 2024 · Updated last year
- Code release for "SegLLM: Multi-round Reasoning Segmentation" ☆128 · Feb 20, 2025 · Updated last year
- A curated list of papers & resources linked to concept learning ☆13 · Aug 9, 2023 · Updated 2 years ago