ZhangHangTao / BadRobotLinks

This is the official repository for the ICLR 2025 accepted paper Badrobot: Manipulating Embodied LLMs in the Physical World.

☆40

Alternatives and similar repositories for BadRobot

Users that are interested in BadRobot are comparing it to the libraries listed below

Sorting:

ZhangHangTao / Awesome-Embodied-AI-Safety
Focused on the safety and security of Embodied AI
☆93Updated last month
William-wAng618 / roboticAttack
Official repo of Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics
☆63Updated 5 months ago
shengyin1224 / SafeAgentBench
Codes for paper "SafeAgentBench: A Benchmark for Safe Task Planning of \\ Embodied LLM Agents"
☆62Updated 11 months ago
zihao-ai / EARBench
Benchmarking Physical Risk Awareness of Foundation Model-based Embodied AI Agents
☆23Updated last year
roywang021 / IDEATOR
Code for ICCV2025 paper——IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves
☆16Updated 6 months ago
eliotjones1 / robogcg
Official GitHub repository for the paper "Adversarial Attacks on Robotic Vision Language Action Models"
☆27Updated 8 months ago
thunxxx / MLLM-Jailbreak-evaluation-MMJ-Bench
☆70Updated 10 months ago
thu-ml / MLA-Trust
A toolbox for benchmarking Multimodal LLM Agents trustworthiness across truthfulness, controllability, safety and privacy dimensions thro…
☆62Updated 3 weeks ago
ChenWu98 / agent-attack
[ICLR 2025] Dissecting adversarial robustness of multimodal language model agents
☆122Updated 11 months ago
wonderNefelibata / Awesome-LRM-Safety
Awesome Large Reasoning Model(LRM) Safety.This repository is used to collect security-related research on large reasoning models such as …
☆81Updated last week
ASTRAL-Group / ASTRA
[CVPR 2025] Official implementation for "Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbre…
☆48Updated 6 months ago
AI45Lab / VLSBench
[ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety
☆53Updated 6 months ago
ebagdasa / adversarial_illusions
Code for "Adversarial Illusions in Multi-Modal Embeddings"
☆31Updated last year
RUCAIBox / HADES
[ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …
☆34Updated last year
SaFo-Lab / AGrail4Agent
[ACL 2025] The official code for "AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection".
☆32Updated 5 months ago
Alibaba-AAIG / Oyster
The Oyster series is a set of safety models developed in-house by Alibaba-AAIG, devoted to building a responsible AI ecosystem. | Oyster …
☆58Updated 4 months ago
Sizhe-Chen / StruQ
official implementation of [USENIX Sec'25] StruQ: Defending Against Prompt Injection with Structured Queries
☆60Updated 2 months ago
NY1024 / Foundation-Model-Paper-Notes
☆73Updated last week
umd-huang-lab / VLM-Poisoning
Code for Neurips 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models"
☆58Updated last year
CryptoAILab / FigStep
[AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts
☆190Updated 7 months ago
AI4Good24 / PsySafe
☆51Updated 11 months ago
erfanshayegani / Jailbreak-In-Pieces
[ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal…
☆78Updated last year
thu-ml / MMTrustEval
A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)
☆174Updated 7 months ago
SaFo-Lab / JailBreakV_28K
[COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and fur…
☆85Updated 8 months ago
Haochen-Luo / CroPA
☆54Updated last year
AI-secure / AgentPoison
[NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"
☆193Updated 9 months ago
chen37058 / Red-Team-Arxiv-Paper-Update
Awesome Jailbreak, red teaming arxiv papers (Automatically Update Every 12th hours)
☆89Updated this week
SproutNan / AI-Safety_SCAV
This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector"
☆47Updated 3 months ago
roywang021 / UMK
Code for ACM MM2024 paper: White-box Multimodal Jailbreaks Against Large Vision-Language Models
☆31Updated last year
ShiJiawenwen / JudgeDeceiver
[CCS 2024] Optimization-based Prompt Injection Attack to LLM-as-a-Judge
☆39Updated 4 months ago