tpai / gandalf-prompt-injection-writeup
A writeup for the Gandalf prompt injection game.
☆36 · Updated last year
Related projects
Alternatives and complementary repositories for gandalf-prompt-injection-writeup
- My inputs for the LLM Gandalf made by Lakera ☆36 · Updated last year
- jailbreak-evaluation: an easy-to-use Python package for language model jailbreak evaluation ☆19 · Updated 2 weeks ago
- Payloads for Attacking Large Language Models ☆64 · Updated 4 months ago
- Turning Gandalf against itself. Use LLMs to automate playing the Lakera Gandalf challenge without needing to set up an account with a platfor… ☆25 · Updated last year
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [arXiv, Apr 2024] ☆220 · Updated 2 months ago
- A benchmark for prompt injection detection systems ☆87 · Updated 2 months ago
- [Corca / ML] Automatically solved Gandalf AI with LLM ☆46 · Updated last year
- A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks ☆47 · Updated 7 months ago
- Dropbox LLM Security research code and results ☆217 · Updated 6 months ago
- ☆62 · Updated last month
- Code for the website www.jailbreakchat.com ☆74 · Updated last year
- A repository of Language Model Vulnerabilities and Exposures (LVEs) ☆107 · Updated 8 months ago
- Codebase of https://arxiv.org/abs/2410.14923 ☆30 · Updated last month
- A collection of prompt injection mitigation techniques ☆18 · Updated last year
- ☆36 · Updated this week
- This repository provides an implementation to formalize and benchmark prompt injection attacks and defenses ☆146 · Updated 2 months ago
- General research for Dreadnode ☆17 · Updated 5 months ago
- PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a… ☆313 · Updated 8 months ago
- Using ML models for red teaming ☆39 · Updated last year
- ☆63 · Updated this week
- Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts ☆405 · Updated last month
- Persuasive Jailbreaker: we can persuade LLMs to jailbreak them! ☆259 · Updated last month
- LLM security and privacy ☆41 · Updated last month
- 🤖🛡️🔍🔒🔑 Tiny package designed to support red teams and penetration testers in exploiting large language model AI solutions ☆16 · Updated 6 months ago
- JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track] ☆236 · Updated last month
- A toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs) ☆121 · Updated 10 months ago
- ☆17 · Updated last week
- Red-Teaming Language Models with DSPy ☆142 · Updated 7 months ago
- Tools and our test data developed for the HackAPrompt 2023 competition ☆29 · Updated last year
- ☆26 · Updated this week