PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. Best Paper Awards @ NeurIPS ML Safety Workshop 2022
★496 · Apr 27, 2026 · Updated last month
Alternatives and similar repositories for PromptInject
Users that are interested in PromptInject are comparing it to the libraries listed below.
- Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts ★586 · Feb 27, 2026 · Updated 3 months ago
- New ways of breaking app-integrated LLMs ★2,098 · Jul 17, 2025 · Updated 10 months ago
- This repository provides a benchmark for prompt injection attacks and defenses in LLMs ★456 · Oct 29, 2025 · Updated 7 months ago
- Universal and Transferable Attacks on Aligned Language Models ★4,690 · Aug 2, 2024 · Updated last year
- A curation of awesome tools, documents and projects about LLM Security. ★1,608 · Aug 20, 2025 · Updated 9 months ago
- [ICLR 2024] The official implementation of our ICLR2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language M…" ★445 · Jan 22, 2025 · Updated last year
- ★740 · Jul 2, 2025 · Updated 11 months ago
- LLMFuzzer - Fuzzing Framework for Large Language Models. LLMFuzzer is the first open-source fuzzing framework specifically designed … ★349 · Feb 12, 2024 · Updated 2 years ago
- Curation of prompts that are known to be adversarial to large language models ★191 · Feb 12, 2023 · Updated 3 years ago
- LLM Prompt Injection Detector ★1,499 · Aug 7, 2024 · Updated last year
- A security scanner for custom LLM applications ★1,207 · Dec 1, 2025 · Updated 6 months ago
- HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal ★976 · Aug 16, 2024 · Updated last year
- Codes and datasets of the paper Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment ★111 · Mar 8, 2024 · Updated 2 years ago
- Vigil: Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs ★479 · Jan 31, 2024 · Updated 2 years ago
- Explore, Establish, Exploit: Red Teaming Language Models from Scratch ★15 · Jun 21, 2023 · Updated 2 years ago
- Dropbox LLM Security research code and results ★258 · May 21, 2024 · Updated 2 years ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025] ★385 · Jan 23, 2025 · Updated last year
- ★60 · Mar 9, 2023 · Updated 3 years ago
- We jailbreak GPT-3.5 Turbo's safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20… ★351 · Feb 23, 2024 · Updated 2 years ago
- Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts" ★40 · Jul 8, 2024 · Updated last year
- Implementation of BEAST adversarial attack for language models (ICML 2024) ★88 · May 14, 2024 · Updated 2 years ago
- ★201 · Nov 26, 2023 · Updated 2 years ago
- The Security Toolkit for LLM Interactions ★3,042 · Dec 15, 2025 · Updated 5 months ago
- Towards Safe LLM with our simple-yet-highly-effective Intention Analysis Prompting ★21 · Mar 25, 2024 · Updated 2 years ago
- Make your GenAI Apps Safe & Secure: Test & harden your system prompt ★684 · Feb 16, 2026 · Updated 3 months ago
- Repository for "StrongREJECT for Empty Jailbreaks" paper ★157 · Nov 3, 2024 · Updated last year
- The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models". ★71 · Oct 23, 2024 · Updated last year
- autoredteam: code for training models that automatically red team other language models ★16 · Aug 9, 2023 · Updated 2 years ago
- The Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and eng… ★3,946 · Updated this week
- Papers and resources related to the security and privacy of LLMs ★579 · Jun 8, 2025 · Updated last year
- Risks and targets for assessing LLMs & LLM vulnerabilities ★35 · May 27, 2024 · Updated 2 years ago
- ★100 · Oct 15, 2023 · Updated 2 years ago
- JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track] ★605 · Apr 4, 2025 · Updated last year
- TAP: An automated jailbreaking method for black-box LLMs ★236 · Dec 10, 2024 · Updated last year
- Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback" ★1,837 · Jun 17, 2025 · Updated 11 months ago
- LLM prompt attacks for hacker CTFs via CTFd. ★14 · Dec 17, 2023 · Updated 2 years ago
- A framework to evaluate the generalization capability of safety alignment for LLMs ★628 · Oct 9, 2025 · Updated 8 months ago
- Every practical and proposed defense against prompt injection. ★699 · Feb 22, 2025 · Updated last year
- Official implementation of AdvPrompter (https://arxiv.org/abs/2404.16873) ★181 · May 6, 2024 · Updated 2 years ago