tpai / gandalf-prompt-injection-writeup
A writeup for the Gandalf prompt injection game.
☆36 · Updated last year
Alternatives and similar repositories for gandalf-prompt-injection-writeup:
Users interested in gandalf-prompt-injection-writeup are comparing it to the repositories listed below.
- ☆38 · Updated 3 weeks ago
- My inputs for the LLM Gandalf made by Lakera ☆41 · Updated last year
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025] ☆264 · Updated last month
- Implementation of the BEAST adversarial attack for language models (ICML 2024) ☆79 · Updated 9 months ago
- [Corca / ML] Automatically solved Gandalf AI with LLM ☆48 · Updated last year
- Fine-tuning base models to build robust task-specific models ☆27 · Updated 10 months ago
- Payloads for Attacking Large Language Models ☆74 · Updated 7 months ago
- Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts ☆451 · Updated 4 months ago
- LLM security and privacy ☆47 · Updated 4 months ago
- Persuasive Jailbreaker: we can persuade LLMs to jailbreak them! ☆283 · Updated 4 months ago
- A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks ☆58 · Updated 10 months ago
- ☆34 · Updated 3 months ago
- This repository provides an implementation to formalize and benchmark prompt injection attacks and defenses ☆172 · Updated last month
- Turning Gandalf against itself. Use LLMs to automate playing the Lakera Gandalf challenge without needing to set up an account with a platfor… ☆29 · Updated last year
- Tools and our test data developed for the HackAPrompt 2023 competition ☆30 · Updated last year
- Whispers in the Machine: Confidentiality in LLM-integrated Systems ☆33 · Updated 2 weeks ago
- Risks and targets for assessing LLMs & LLM vulnerabilities ☆30 · Updated 8 months ago
- ☆497 · Updated 2 months ago
- Tree of Attacks (TAP) Jailbreaking Implementation ☆99 · Updated last year
- TAP: An automated jailbreaking method for black-box LLMs ☆145 · Updated 2 months ago
- ☆64 · Updated last month
- [ACL24] Official repo of the paper `ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs` ☆59 · Updated 2 months ago
- A benchmark for prompt injection detection systems ☆96 · Updated 2 weeks ago
- Dropbox LLM Security research code and results ☆220 · Updated 9 months ago
- JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track] ☆298 · Updated 4 months ago
- Curation of prompts that are known to be adversarial to large language models ☆179 · Updated 2 years ago
- Package to optimize adversarial attacks against (large) language models with varied objectives ☆66 · Updated last year
- Codebase of https://arxiv.org/abs/2410.14923 ☆44 · Updated 4 months ago
- PAL: Proxy-Guided Black-Box Attack on Large Language Models ☆49 · Updated 6 months ago