shengyin1224 / SafeAgentBench
Code for the paper "SafeAgentBench: A Benchmark for Safe Task Planning of Embodied LLM Agents"
☆43 Updated 4 months ago
Alternatives and similar repositories for SafeAgentBench
Users interested in SafeAgentBench are comparing it to the repositories listed below
- ☆20 Updated last month
- ☆50 Updated last month
- Official repo of Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics ☆30 Updated 3 months ago
- Official code for the paper: Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld ☆57 Updated 9 months ago
- ☆131 Updated last year
- Embodied Agent Interface (EAI): Benchmarking LLMs for Embodied Decision Making (NeurIPS D&B 2024 Oral) ☆213 Updated 4 months ago
- HAZARD challenge ☆36 Updated 2 months ago
- [ICML 2025 Oral] Official repo of EmbodiedBench, a comprehensive benchmark designed to evaluate MLLMs as embodied agents. ☆144 Updated this week
- This is the official repository for the ICLR 2025 accepted paper Badrobot: Manipulating Embodied LLMs in the Physical World. ☆28 Updated 2 weeks ago
- ICLR 2025 Agent-Related Papers ☆70 Updated 8 months ago
- LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents (ICLR 2024) ☆78 Updated last month
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety ☆47 Updated last month
- ☆46 Updated 5 months ago
- [ICML 2024] The official implementation of "DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning" ☆82 Updated last month
- Official code release of AAAI 2024 paper SayCanPay. ☆49 Updated last year
- Official Implementation of ReALFRED (ECCV'24) ☆42 Updated 9 months ago
- [CVPR2024] This is the official implementation of MP5 ☆103 Updated last year
- [IROS'25 Oral & NeurIPSw'24] Official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simula… ☆91 Updated last month
- A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks) ☆152 Updated 2 weeks ago
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety" ☆20 Updated 3 weeks ago
- MuMA-ToM: Multi-modal Multi-Agent Theory of Mind ☆29 Updated 5 months ago
- Code for NeurIPS 2024 paper "Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs" ☆37 Updated 4 months ago
- Evaluate Multimodal LLMs as Embodied Agents ☆52 Updated 5 months ago
- Official code for the paper: WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents ☆38 Updated 2 months ago
- Evaluating Durability: Benchmark Insights into Multimodal Watermarking ☆10 Updated last year
- ProgPrompt for Virtualhome ☆138 Updated 2 years ago
- Implementation of the MATRIX framework (ICML 2024) ☆56 Updated last year
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization ☆132 Updated 3 months ago
- ☆30 Updated 9 months ago
- [arXiv 2023] Embodied Task Planning with Large Language Models ☆188 Updated last year