AI45Lab / IS-Bench
[AAAI 2026] Data and Code for Paper IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks
☆39 Updated 2 months ago
Alternatives and similar repositories for IS-Bench
Users who are interested in IS-Bench are comparing it to the repositories listed below.
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI ☆248 Updated 3 months ago
- Training VLM agents with multi-turn reinforcement learning ☆381 Updated this week
- MAT: Multi-modal Agent Tuning 🔥 ICLR 2025 (Spotlight) ☆83 Updated last month
- ICLR 2025 Agent-Related Papers ☆75 Updated last year
- A paper list of Awesome Latent Space. ☆305 Updated last week
- ☆112 Updated 4 months ago
- Imagine While Reasoning in Space: Multimodal Visualization-of-Thought (ICML 2025) ☆66 Updated 9 months ago
- The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning" ☆140 Updated 3 weeks ago
- Official repository for "CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation" ☆62 Updated last month
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety" ☆30 Updated 7 months ago
- Towards Efficient Multimodal Large Language Models: A Survey on Token Compression ☆78 Updated 2 weeks ago
- Official implementation of GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents ☆217 Updated 8 months ago
- A Self-Training Framework for Vision-Language Reasoning ☆88 Updated last year
- [ICML 2025 Oral] Official repo of EmbodiedBench, a comprehensive benchmark designed to evaluate MLLMs as embodied agents. ☆260 Updated 3 months ago
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety ☆53 Updated 6 months ago
- Official Repository of LatentSeek ☆76 Updated 7 months ago
- ☆16 Updated 3 months ago
- [ACL 2025] "World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning." https://arxiv.org/abs/2503.1… ☆16 Updated 6 months ago
- Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning ☆128 Updated this week
- [ICML 2025] Official Implementation of GLIDER ☆72 Updated 3 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25] ☆179 Updated 7 months ago
- ☆21 Updated 6 months ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆104 Updated 4 months ago
- This repository will continuously update the latest papers, technical reports, benchmarks about multimodal reasoning! ☆53 Updated 10 months ago
- ☆114 Updated 6 months ago
- Official Repository of "Learning what reinforcement learning can't" ☆79 Updated last month
- This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning". ☆83 Updated 6 months ago
- Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs" ☆73 Updated 4 months ago
- Official codebase for the paper Latent Visual Reasoning ☆98 Updated 3 months ago
- 🔥 An open-source survey of the latest video reasoning tasks, paradigms, and benchmarks. ☆130 Updated 2 weeks ago