AI45Lab / IS-Bench
Data and Code for Paper IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks
☆32 Updated 3 weeks ago
Alternatives and similar repositories for IS-Bench
Users who are interested in IS-Bench are comparing it to the repositories listed below:
- MAT: Multi-modal Agent Tuning 🔥 ICLR 2025 (Spotlight) ☆77 Updated 6 months ago
- Training VLM agents with multi-turn reinforcement learning ☆347 Updated 2 weeks ago
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI ☆216 Updated 2 months ago
- Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent w… ☆88 Updated 3 months ago
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety ☆52 Updated 4 months ago
- Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs" ☆72 Updated 2 months ago
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety" ☆30 Updated 5 months ago
- Official implementation of GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents ☆206 Updated 7 months ago
- Imagine While Reasoning in Space: Multimodal Visualization-of-Thought (ICML 2025) ☆59 Updated 8 months ago
- [ICML 2025 Oral] Official repo of EmbodiedBench, a comprehensive benchmark designed to evaluate MLLMs as embodied agents. ☆236 Updated last month
- Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning ☆121 Updated last week
- ☆111 Updated 3 months ago
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning" ☆30 Updated 4 months ago
- (ACL 2025) 🔥🔥🔥 Code for "Empowering Multimodal Large Language Models with Evol-Instruct" ☆19 Updated 7 months ago
- Codes for paper "SafeAgentBench: A Benchmark for Safe Task Planning of Embodied LLM Agents" ☆59 Updated 9 months ago
- Official repository for "CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation" ☆51 Updated this week
- ☆111 Updated 4 months ago
- ICLR 2025 Agent-Related Papers ☆74 Updated last year
- Official Repository of "Learning what reinforcement learning can't" ☆70 Updated last month
- This repository will continuously update the latest papers, technical reports, benchmarks about multimodal reasoning! ☆53 Updated 8 months ago
- [NeurIPS'25] The official code of "PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning" ☆27 Updated 2 months ago
- ☆21 Updated 4 months ago
- A paper list of Awesome Latent Space. ☆230 Updated this week
- This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning". ☆73 Updated 5 months ago
- [ICML 2025] Official Implementation of GLIDER ☆71 Updated 2 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25] ☆169 Updated 6 months ago
- A Self-Training Framework for Vision-Language Reasoning ☆88 Updated 10 months ago
- ☆57 Updated 5 months ago
- ☆28 Updated 10 months ago
- [CVPR' 25] Interleaved-Modal Chain-of-Thought ☆97 Updated 2 weeks ago