user1342 / Awesome-LLM-Red-TeamingLinks
A curated list of awesome LLM Red Teaming training, resources, and tools.
☆17Updated 3 months ago
Alternatives and similar repositories for Awesome-LLM-Red-Teaming
Users that are interested in Awesome-LLM-Red-Teaming are comparing it to the libraries listed below
Sorting:
- Whispers in the Machine: Confidentiality in Agentic Systems☆39Updated last month
- Codebase of https://arxiv.org/abs/2410.14923☆48Updated 8 months ago
- Prompt Injection Attacks against GPT-4, Gemini, Azure, Azure with Jailbreak☆23Updated 8 months ago
- ☆34Updated 7 months ago
- A prompt injection game to collect data for robust ML research☆62Updated 5 months ago
- CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on…☆30Updated this week
- LLM security and privacy☆48Updated 8 months ago
- A repository of Language Model Vulnerabilities and Exposures (LVEs).☆112Updated last year
- 🤖🛡️🔍🔒🔑 Tiny package designed to support red teams and penetration testers in exploiting large language model AI solutions.☆23Updated last year
- TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a…☆56Updated 3 months ago
- ☆74Updated 7 months ago
- AgentFence is an open-source platform for automatically testing AI agent security. It identifies vulnerabilities such as prompt injection…☆15Updated 3 months ago
- This project investigates the security of large language models by performing binary classification of a set of input prompts to discover…☆40Updated last year
- Risks and targets for assessing LLMs & LLM vulnerabilities☆30Updated last year
- Universal Robustness Evaluation Toolkit (for Evasion)☆31Updated last month
- Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact and break out of shell environments using the Over…☆13Updated last year
- Implementation of BEAST adversarial attack for language models (ICML 2024)☆88Updated last year
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"☆51Updated 2 months ago
- A collection of prompt injection mitigation techniques.☆22Updated last year
- This repository provides a benchmark for prompt Injection attacks and defenses☆232Updated 3 weeks ago
- An Execution Isolation Architecture for LLM-Based Agentic Systems☆82Updated 4 months ago
- OllaDeck is a purple technology stack for Generative AI (text modality) cybersecurity. It provides a comprehensive set of tools for both …☆18Updated 9 months ago
- Code for the website www.jailbreakchat.com☆96Updated last year
- ☆135Updated last month
- [SPOILER ALERT] Solutions to Gandalf, the prompt hacking/red teaming game from Lakera AI☆25Updated last year
- A collection of agents that use Large Language Models (LLMs) to perform tasks common on our day to day jobs in cyber security.☆130Updated last year
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives☆69Updated last year
- A Completely Modular LLM Reverse Engineering, Red Teaming, and Vulnerability Research Framework.☆46Updated 7 months ago
- ☆116Updated 2 weeks ago
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper.☆53Updated 3 months ago