AI45Lab / IS-Bench
Data and Code for Paper IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks
☆18 Updated last week
Alternatives and similar repositories for IS-Bench
Users who are interested in IS-Bench are comparing it to the repositories listed below
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI ☆105 Updated last month
- ☆103 Updated last month
- Official implementation of GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents ☆158 Updated 3 months ago
- repo for paper https://arxiv.org/abs/2504.13837 ☆180 Updated last month
- ☆52 Updated last month
- Official Repository of "Learning what reinforcement learning can't" ☆54 Updated this week
- ☆155 Updated 2 months ago
- This repository will continuously update the latest papers, technical reports, benchmarks about multimodal reasoning! ☆47 Updated 4 months ago
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety ☆48 Updated 2 weeks ago
- ICLR 2025 Agent-Related Papers ☆71 Updated 8 months ago
- ☆197 Updated this week
- Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs" ☆53 Updated last week
- Imagine While Reasoning in Space: Multimodal Visualization-of-Thought (ICML 2025) ☆37 Updated 3 months ago
- ☆323 Updated last week
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆142 Updated 2 weeks ago
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! ☆68 Updated 4 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance" ☆271 Updated 3 weeks ago
- ☆26 Updated 6 months ago
- [arXiv] Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs ☆36 Updated 2 months ago
- A comprehensive collection of process reward models. ☆99 Updated 2 weeks ago
- ☆255 Updated last month
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning. ☆282 Updated 3 weeks ago
- NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆83 Updated 2 months ago
- A Self-Training Framework for Vision-Language Reasoning ☆80 Updated 6 months ago
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety" ☆21 Updated last month
- A paper list for spatial reasoning ☆127 Updated last month
- Segment Policy Optimization: Improved Credit Assignment in Reinforcement Learning for LLMs ☆27 Updated 2 weeks ago
- AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models, ICLR 2025 (Outstanding Paper) ☆293 Updated last month
- Collections of Papers and Projects for Multimodal Reasoning. ☆105 Updated 3 months ago
- Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent w… ☆68 Updated last week