Chenfei-Liao / Multi-Modal-Semantic-Segmentation-Robustness-Benchmark
(CVPR Workshop Best Paper Award) Benchmarking Multi-modal Semantic Segmentation under Sensor Failures: Missing and Noisy Modality Robustness
☆15 Updated 2 months ago
Alternatives and similar repositories for Multi-Modal-Semantic-Segmentation-Robustness-Benchmark
Users who are interested in Multi-Modal-Semantic-Segmentation-Robustness-Benchmark are comparing it to the repositories listed below:
- 😎 A curated list of CVPR 2025 Oral papers. Total: 96 ☆59 Updated last month
- A paper list for spatial reasoning ☆595 Updated 2 weeks ago
- [ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs' ☆313 Updated 8 months ago
- [NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics" ☆218 Updated 3 weeks ago
- [NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models" ☆305 Updated last year
- [ICCV25 Oral] Token Activation Map to Visually Explain Multimodal LLMs ☆150 Updated 3 weeks ago
- SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning ☆102 Updated 6 months ago
- [ICCV 2025] MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation ☆47 Updated 2 months ago
- 📖 This is a repository for organizing papers, codes and other resources related to Visual Reinforcement Learning. ☆375 Updated last week
- [TPAMI 2025] Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models ☆159 Updated last week
- An up-to-date collection and survey of vision-language model papers and model GitHub repositories. Continuously updated. ☆493 Updated this week
- Awesome VLM-CL. Continual Learning for VLMs: A Survey and Taxonomy Beyond Forgetting ☆136 Updated 3 weeks ago
- Survey: https://arxiv.org/pdf/2507.20198 ☆269 Updated 3 weeks ago
- [ACM CSUR 2025] Understanding World or Predicting Future? A Comprehensive Survey of World Models ☆381 Updated last month
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025) ☆226 Updated 5 months ago
- [WACV 2025] Code for Enhancing Vision-Language Few-Shot Adaptation with Negative Learning ☆11 Updated 10 months ago
- [NeurIPS 2025] 3DRS: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding ☆137 Updated last month
- [CVPR2025] FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression ☆59 Updated 3 months ago
- EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models ☆53 Updated 3 weeks ago
- Official repo of Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics ☆57 Updated 4 months ago
- Official repo and evaluation implementation of VSI-Bench ☆658 Updated 5 months ago
- Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence ☆420 Updated last week
- [NeurIPS 2025] ⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning. ☆253 Updated 3 months ago
- A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP. ☆739 Updated last month
- [CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception ☆148 Updated this week
- Official repository for VisionZip (CVPR 2025) ☆396 Updated 5 months ago
- Collection of awesome Continual Test-Time Adaptation methods ☆23 Updated last year
- MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, … ☆198 Updated 8 months ago
- Repository for Vision-and-Language Navigation via Causal Learning (Accepted by CVPR 2024) ☆98 Updated 7 months ago
- Code for the paper "Compositional Entailment Learning for Hyperbolic Vision-Language Models". ☆96 Updated 6 months ago