yongliu20 / UniLSeg
[CVPR 2024] Official implementation of "Universal Segmentation at Arbitrary Granularity with Language Instruction"
☆284 Updated last year
Alternatives and similar repositories for UniLSeg
Users that are interested in UniLSeg are comparing it to the libraries listed below
- (ICML 2024) Spider: A Unified Framework for Context-dependent Concept Segmentation ☆353 Updated 10 months ago
- 🔥 [NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies ☆224 Updated 9 months ago
- [ICLR 2025] Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling ☆82 Updated 3 weeks ago
- ☆95 Updated 2 years ago
- High Quality Video Reasoning Segmentation ☆144 Updated 2 months ago
- 🔥 [AAAI 2026 Oral] Official code for Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptat… ☆75 Updated last year
- Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better ☆186 Updated this week
- Official Repository of OmniCaptioner ☆168 Updated 9 months ago
- [Tutorial] Few-Step Distillation for Text-to-Image Generation: A Practical Guide ☆337 Updated last month
- Evaluation of Text-to-Video Generation Models: A Dynamics Perspective [NeurIPS 2024]. ☆274 Updated last year
- 🔥 OneThinker: All-in-one Reasoning Model for Image and Video ☆388 Updated 3 weeks ago
- The official repository of SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization ☆156 Updated last week
- ☆207 Updated 6 months ago
- [NeurIPS 2024] AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation ☆115 Updated last year
- [ICCV 2023] Spectrum-guided Multi-granularity Referring Video Object Segmentation. ☆110 Updated 9 months ago
- **Deep Video Discovery (DVD)** is a deep-research style question answering agent designed for understanding extra-long videos. ☆346 Updated 3 months ago
- 🦎 Yo'Chameleon: Your Personalized Chameleon (CVPR 2025) ☆150 Updated 8 months ago
- [CVPR 2024] Official code for "Text-Driven Image Editing via Learnable Regions" ☆227 Updated last year
- CoS: Chain-of-Shot Prompting for Long Video Understanding ☆53 Updated 11 months ago
- [ICLR 2024] Official Codebase for "InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists" ☆461 Updated last year
- Data and sample evaluation codes for Multimodal Rewardbench 2 ☆133 Updated last month
- [ACM MM'2024] Official repository for "Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval" ☆42 Updated last year
- PySegMetrics (PSM): A Python-based Simple yet Efficient Evaluation Toolbox for Segmentation-like tasks ☆122 Updated last year
- [NeurIPS 2025] NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding ☆350 Updated last month
- (ICCV-2025 Official Code) Improving Generalist Model with Domain-Specific Experts ☆87 Updated 3 months ago
- OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation ☆255 Updated 4 months ago
- [Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics]: VisuoThink: Empowering LVLM Reasoning with Mul… ☆101 Updated 6 months ago
- WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World ☆177 Updated 2 weeks ago
- [NeurIPS 2025] More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models ☆215 Updated 3 months ago
- MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE ☆1,087 Updated 3 months ago