ethz-spylab/agentdojo

A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.

☆684

Alternatives and similar repositories for agentdojo

Users who are interested in agentdojo are comparing it to the libraries listed below.

uiuc-kang-lab / InjecAgent
☆153 · Jul 2, 2024 · Updated 2 years ago
agiresearch / ASB
Agent Security Bench (ASB)
☆273 · Apr 16, 2026 · Updated 3 months ago
SaFo-Lab / AgentDyn
The official implementation of the paper "AgentDyn: Are Your Agent Security Defenses Deployable in Real-World Dynamic Environments?"
☆68 · May 19, 2026 · Updated 2 months ago
facebookresearch / SecAlign
Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"
☆98 · Jul 2, 2026 · Updated 3 weeks ago
thu-coai / Agent-SafetyBench
☆149 · Aug 11, 2025 · Updated 11 months ago
liu00222 / Open-Prompt-Injection
This repository provides a benchmark for prompt injection attacks and defenses in LLMs.
☆467 · Oct 29, 2025 · Updated 8 months ago
google-research / camel-prompt-injection
Code for the paper "Defeating Prompt Injections by Design"
☆357 · Jun 20, 2025 · Updated last year
kaijiezhu11 / MELON
[ICML'25] MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents
☆37 · Jul 31, 2025 · Updated 11 months ago
facebookresearch / wasp
Official implementation of the WASP web agent security benchmark
☆98 · Apr 13, 2026 · Updated 3 months ago
facebookresearch / Meta_SecAlign
Repo for the paper "Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks".
☆70 · Jun 11, 2026 · Updated last month
Sizhe-Chen / StruQ
Official implementation of [USENIX Sec'25] StruQ: Defending Against Prompt Injection with Structured Queries
☆77 · Nov 10, 2025 · Updated 8 months ago
lancopku / agent-backdoor-attacks
Code & Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024]
☆116 · Sep 27, 2024 · Updated last year
Greysahy / ipiguard
[EMNLP 2025 Oral] IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents
☆22 · Sep 16, 2025 · Updated 10 months ago
sunblaze-ucb / progent
Progent: Securing AI Agents with Privilege Control
☆42 · May 14, 2026 · Updated 2 months ago
uiuc-kang-lab / AdaptiveAttackAgent
☆39 · Mar 12, 2025 · Updated last year
SaFo-Lab / DRIFT
[NeurIPS 2025] The official implementation of the paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agen…
☆58 · Jul 16, 2026 · Updated last week
microsoft / BIPIA
A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
☆148 · Apr 15, 2024 · Updated 2 years ago
sleeepeer / PIArena
[ACL 2026] PIArena: A Platform for Prompt Injection Evaluation
☆41 · Apr 28, 2026 · Updated 2 months ago
MurrayTom / ToolSafe
Official Implementation of "ToolSafe: Enhancing Tool Invocation Safety of LLM-based Agents via Proactive Step-level Guardrail and Feedbac…
☆74 · Mar 25, 2026 · Updated 4 months ago
compsec-snu / pfi
PFI: Prompt Flow Integrity to Prevent Privilege Escalation in LLM Agents
☆31 · Mar 26, 2025 · Updated last year
Sizhe-Chen / DefensiveToken
Repo for the paper "Defending Against Prompt Injection With a Few DefensiveTokens"
☆19 · Nov 10, 2025 · Updated 8 months ago
guardagent / code
☆47 · Dec 9, 2025 · Updated 7 months ago
invariantlabs-ai / invariant
Guardrails for secure and robust agent development
☆436 · Jan 12, 2026 · Updated 6 months ago
CHATS-lab / ToolShield
[ICML 2026] Official implementation for paper "Unsafer in Many Turns: Benchmarking and Defending Multi-Turn Safety Risks in Tool-Using Ag…
☆29 · Jul 6, 2026 · Updated 2 weeks ago
ChenWu98 / agent-attack
[ICLR 2025] Dissecting Adversarial Robustness of Multimodal Language Model Agents
☆140 · Feb 19, 2025 · Updated last year
AI-secure / AgentPoison
[NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"
☆231 · Jun 17, 2026 · Updated last month
facebookresearch / rl-injector
Official release of code for the paper "RL Is a Hammer and LLMs Are Nails: A Simple RL Approach to Stronger Prompt Injection Attacks"
☆53 · May 6, 2026 · Updated 2 months ago
Astarojth / AgentAuditor-ASSEBench
☆40 · May 29, 2026 · Updated last month
RPC2 / AutoInject
☆20 · Jun 12, 2026 · Updated last month
aisa-group / skill-inject
Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks
☆88 · Jul 1, 2026 · Updated 3 weeks ago
sleeepeer / PISanitizer
PISanitizer: Preventing Prompt Injection to Long-Context LLMs via Prompt Sanitization
☆18 · Dec 10, 2025 · Updated 7 months ago
AI-secure / RedCode
[NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents
☆86 · Apr 24, 2026 · Updated 3 months ago
hwanchang00 / ChatInject
[ICLR 2026] Official implementation of "ChatInject: Abusing Chat Templates for Prompt Injection in LLM Agents"
☆17 · Mar 23, 2026 · Updated 4 months ago
m4p1e / agent-sentinel
AgentSentinel: An End-to-End and Real-Time Security Defense Framework for Computer-Use Agents
☆35 · Aug 31, 2025 · Updated 10 months ago
ryoungj / ToolEmu
[ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use
☆214 · Mar 22, 2024 · Updated 2 years ago
khhung-906 / Attention-Tracker
Code for our NAACL 2025 paper "Attention Tracker: Detecting Prompt Injection Attacks in LLMs"
☆28 · Sep 19, 2025 · Updated 10 months ago
JailbreakBench / jailbreakbench
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
☆635 · Apr 4, 2025 · Updated last year
microsoft / TaskTracker
TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a…
☆92 · Sep 1, 2025 · Updated 10 months ago
SaFo-Lab / AGrail4Agent
[ACL 2025] The official code for "AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection".
☆42 · Aug 4, 2025 · Updated 11 months ago