tpai / gandalf-prompt-injection-writeup
A writeup for the Gandalf prompt injection game.
☆37 · Updated last year
Alternatives and similar repositories for gandalf-prompt-injection-writeup:
Users interested in gandalf-prompt-injection-writeup are comparing it to the repositories listed below.
- My inputs for the LLM Gandalf made by Lakera ☆41 · Updated last year
- Codebase of https://arxiv.org/abs/2410.14923 ☆44 · Updated 5 months ago
- A benchmark for prompt injection detection systems. ☆99 · Updated last month
- [Corca / ML] Automatically solved Gandalf AI with LLM ☆48 · Updated last year
- Payloads for attacking large language models ☆77 · Updated 8 months ago
- Experimental tools to backdoor large language models by re-writing their system prompts at a raw parameter level. This allows you to pote… ☆152 · Updated last month
- jailbreak-evaluation is an easy-to-use Python package for language model jailbreak evaluation. ☆22 · Updated 4 months ago
- A productionized greedy coordinate gradient (GCG) attack tool for large language models (LLMs) ☆91 · Updated 3 months ago
- Tree of Attacks (TAP) jailbreaking implementation ☆105 · Updated last year
- Turning Gandalf against itself. Use LLMs to automate playing the Lakera Gandalf challenge without needing to set up an account with a platfor… ☆29 · Updated last year
- A prompt injection game to collect data for robust ML research ☆55 · Updated 2 months ago
- ☆41 · Updated 4 months ago
- Curation of prompts that are known to be adversarial to large language models ☆179 · Updated 2 years ago
- A repository of Language Model Vulnerabilities and Exposures (LVEs).