ChenWu98/agent-attack

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/ChenWu98/agent-attack)

ChenWu98 / agent-attack

[ICLR 2025] Dissecting adversarial robustness of multimodal language model agents

☆139

Alternatives and similar repositories for agent-attack

Users that are interested in agent-attack are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

OSU-NLP-Group / RedTeamCUA
View on GitHub
[ICLR'26 Oral] RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments
☆57Feb 9, 2026Updated 5 months ago
ZZZhr-1 / Robust_GUI_Grounding
View on GitHub
On the Robustness of GUI Grounding Models Against Image Attacks
☆12Apr 8, 2025Updated last year
OSU-NLP-Group / EIA_against_webagent
View on GitHub
☆40Oct 2, 2024Updated last year
OSU-NLP-Group / AgentSafety
View on GitHub
☆191Oct 31, 2025Updated 8 months ago
yunqing-me / AttackVLM
View on GitHub
[NeurIPS-2023] Annual Conference on Neural Information Processing Systems
☆231Dec 22, 2024Updated last year
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
uiuc-kang-lab / InjecAgent
View on GitHub
☆152Jul 2, 2024Updated 2 years ago
erfanshayegani / Jailbreak-In-Pieces
View on GitHub
[ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal…
☆93Jun 6, 2024Updated 2 years ago
SALT-NLP / PopupAttack
View on GitHub
Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups
☆51Dec 23, 2024Updated last year
Shelton1013 / Chain_of_Attack
View on GitHub
[CVPR'25]Chain of Attack: On the Robustness of Vision-Language Models Against Transfer-Based Adversarial Attacks
☆32Jun 12, 2025Updated last year
thu-ml / Attack-Bard
View on GitHub
☆108Feb 16, 2024Updated 2 years ago
liudaizong / Awesome-LVLM-Attack
View on GitHub
😎 up-to-date & curated list of awesome Attacks on Large-Vision-Language-Models papers, methods & resources.
☆567Updated this week
pasquini-dario / LLM_NeuralExec
View on GitHub
Code to generate NeuralExecs (prompt injection for LLMs)
☆27Oct 5, 2025Updated 9 months ago
AI-secure / AdvAgent
View on GitHub
☆25May 28, 2025Updated last year
jiamingzhang94 / AnyAttack
View on GitHub
CVPR 2025 - Anyattack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models
☆74Aug 7, 2025Updated 11 months ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
cua-framework / agents
View on GitHub
☆22Jan 30, 2026Updated 5 months ago
Rookie143 / BadRobot
View on GitHub
This is the official repository for the ICLR 2025 accepted paper Badrobot: Manipulating Embodied LLMs in the Physical World.
☆46Jun 11, 2026Updated last month
TeamPigeonLab / CS-DJ
View on GitHub
Accept by CVPR 2025 (highlight)
☆25Jun 8, 2025Updated last year
CryptoAILab / FigStep
View on GitHub
[AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts
☆212Jun 26, 2025Updated last year
UCSC-VLAA / vllm-safety-benchmark
View on GitHub
[ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"
☆89Nov 28, 2023Updated 2 years ago
VILA-Lab / M-Attack
View on GitHub
[NeurIPS25 & ICML25 Workshop on Reliable and Responsible Foundation Models] A Simple Baseline Achieving Over 90% Success Rate Against the…
☆100Feb 3, 2026Updated 5 months ago
facebookresearch / wasp
View on GitHub
Official implementation of the WASP web agent security benchmark
☆98Apr 13, 2026Updated 3 months ago
Unispac / Visual-Adversarial-Examples-Jailbreak-Large-Language-Models
View on GitHub
Repository for the Paper (AAAI 2024, Oral) --- Visual Adversarial Examples Jailbreak Large Language Models
☆281May 13, 2024Updated 2 years ago
ys-zong / VLGuard
View on GitHub
[ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models.
☆90Jan 19, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
sail-sg / Agent-Smith
View on GitHub
[ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
☆123Mar 26, 2024Updated 2 years ago
SheltonLiu-N / AutoDAN
View on GitHub
[ICLR 2024] The official implementation of our ICLR2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language M…
☆453Jan 22, 2025Updated last year
HanxunH / XTransferBench
View on GitHub
[ICML 2025] X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP
☆48Feb 3, 2026Updated 5 months ago
guardagent / code
View on GitHub
☆47Dec 9, 2025Updated 7 months ago
locuslab / acr-memorization
View on GitHub
☆41Dec 19, 2024Updated last year
huanranchen / VLMTransfer
View on GitHub
A package that achieves 95%+ transfer attack success rate against GPT-4
☆26Oct 24, 2024Updated last year
ethz-spylab / agentdojo
View on GitHub
A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
☆670Jun 2, 2026Updated last month
lapisrocks / rpo
View on GitHub
Official repository for "Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks"
☆62Aug 8, 2024Updated last year
liu00222 / Open-Prompt-Injection
View on GitHub
This repository provides a benchmark for prompt injection attacks and defenses in LLMs
☆465Oct 29, 2025Updated 8 months ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
facebookresearch / SecAlign
View on GitHub
Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"
☆98Jul 2, 2026Updated 2 weeks ago
Norrrrrrr-lyn / WAInjectBench
View on GitHub
Benchmarking prompt injection detections for web agents.
☆20Jul 10, 2026Updated last week
liuxuannan / Awesome-Multimodal-Jailbreak
View on GitHub
A Survey on Jailbreak Attacks and Defenses against Multimodal Generative Models
☆332Jan 11, 2026Updated 6 months ago
umd-huang-lab / VLM-Poisoning
View on GitHub
Code for Neurips 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models"
☆61Jan 15, 2025Updated last year
Greysahy / ipiguard
View on GitHub
[EMNLP 2025 Oral] IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents
☆22Sep 16, 2025Updated 10 months ago
Zsbyqx20 / AgentHazard
View on GitHub
Mobile GUI Agents under Real-world Threats: Are We There Yet?
☆17May 18, 2026Updated 2 months ago
facebookresearch / advprompter
View on GitHub
Official implementation of AdvPrompter https//arxiv.org/abs/2404.16873
☆183May 6, 2024Updated 2 years ago