AI45Lab / IS-Bench
Data and Code for Paper IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks
☆33 Updated last month
Alternatives and similar repositories for IS-Bench
Users who are interested in IS-Bench are comparing it to the repositories listed below.
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI ☆234 Updated 2 months ago
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety ☆52 Updated 5 months ago
- A paper list of Awesome Latent Space. ☆276 Updated last week
- MAT: Multi-modal Agent Tuning 🔥 ICLR 2025 (Spotlight) ☆81 Updated 3 weeks ago
- Training VLM agents with multi-turn reinforcement learning ☆365 Updated last week
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety" ☆30 Updated 6 months ago
- Imagine While Reasoning in Space: Multimodal Visualization-of-Thought (ICML 2025) ☆63 Updated 8 months ago
- Codes for paper "SafeAgentBench: A Benchmark for Safe Task Planning of Embodied LLM Agents" ☆60 Updated 10 months ago
- ICLR 2025 Agent-Related Papers ☆74 Updated last year
- ☆21 Updated 5 months ago
- [ICML 2025 Oral] Official repo of EmbodiedBench, a comprehensive benchmark designed to evaluate MLLMs as embodied agents. ☆249 Updated 2 months ago
- Official implementation of GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents ☆211 Updated 8 months ago
- [NeurIPS 2025 Spotlight] Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning. ☆101 Updated 3 weeks ago
- ☆16 Updated 2 months ago
- ☆112 Updated 3 months ago
- Official eval code for ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation ☆27 Updated 3 weeks ago
- Official repository for "CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation" ☆58 Updated 3 weeks ago
- The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning" ☆136 Updated 2 weeks ago
- This repository will continuously update the latest papers, technical reports, benchmarks about multimodal reasoning! ☆53 Updated 9 months ago
- Official repo of Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics ☆56 Updated 4 months ago
- Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent w… ☆95 Updated 4 months ago
- [ICML 2025 Oral] The official repository for the paper "Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchma… ☆69 Updated 5 months ago
- Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs" ☆73 Updated 3 months ago
- Code for ICLR 2025 Paper "GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment" ☆20 Updated 11 months ago
- [NeurIPS 2025]⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning. ☆252 Updated 3 months ago
- [NeurIPS 2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models ☆73 Updated 7 months ago
- Official codebase for the paper Latent Visual Reasoning ☆76 Updated 2 months ago
- ☆112 Updated 5 months ago
- 🔥An open-source survey of the latest video reasoning tasks, paradigms, and benchmarks. ☆118 Updated last week
- Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning ☆127 Updated 3 weeks ago