microsoft/BIPIA
A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
☆57 · Updated 10 months ago
Alternatives and similar repositories for BIPIA:
Users interested in BIPIA are comparing it to the repositories listed below.
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ☆92 · Updated this week
- [NDSS'25 Poster] A collection of automated evaluators for assessing jailbreak attempts. ☆108 · Updated 3 weeks ago
- The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models". ☆39 · Updated 3 months ago
- This repository provides implementations to formalize and benchmark prompt injection attacks and defenses. ☆171 · Updated 3 weeks ago
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization". ☆35 · Updated 3 weeks ago
- TAP: An automated jailbreaking method for black-box LLMs. ☆144 · Updated 2 months ago
- TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a… ☆43 · Updated 2 months ago
- ☆50 · Updated 7 months ago
- Agent Security Bench (ASB). ☆58 · Updated this week
- Papers about red teaming LLMs and multimodal models. ☆96 · Updated 2 months ago
- ☆90 · Updated last year
- Implementation of the BEAST adversarial attack for language models (ICML 2024). ☆79 · Updated 9 months ago
- ☆80 · Updated last year
- PAL: Proxy-Guided Black-Box Attack on Large Language Models. ☆49 · Updated 5 months ago
- Code to generate NeuralExecs (prompt injection for LLMs). ☆19 · Updated 2 months ago
- [ACL'24] Official repo of the paper "ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs". ☆57 · Updated 2 months ago
- Bag of Tricks: Benchmarking of jailbreak attacks on LLMs, with empirical tricks for LLM jailbreaking (NeurIPS 2024). ☆115 · Updated 2 months ago
- Official repository for the ACL 2024 paper "SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding". ☆116 · Updated 6 months ago
- [AAAI'25 (Oral)] Jailbreaking Large Vision-Language Models via Typographic Visual Prompts. ☆108 · Updated 2 months ago
- Official implementation of AdvPrompter (https://arxiv.org/abs/2404.16873). ☆139 · Updated 9 months ago
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning". ☆91 · Updated 3 weeks ago
- [ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability. ☆133 · Updated last month
- The official repository of the paper "On the Exploitability of Instruction Tuning". ☆58 · Updated last year
- ☆72 · Updated last week
- ☆18 · Updated 10 months ago
- [USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models. ☆115 · Updated last week
- ☆50 · Updated last month
- The automated prompt injection framework for LLM-integrated applications. ☆184 · Updated 5 months ago
- ☆10 · Updated 2 months ago
- A curated list of trustworthy Generative AI papers. Daily updating... ☆68 · Updated 5 months ago