TrustAI-laboratory / Many-Shot-Jailbreaking-DemoLinks

Research on "Many-Shot Jailbreaking" in Large Language Models (LLMs). It unveils a novel technique capable of bypassing the safety mechanisms of LLMs, including those developed by Anthropic and other leading AI organizations. Resources

☆11

Alternatives and similar repositories for Many-Shot-Jailbreaking-Demo

Users that are interested in Many-Shot-Jailbreaking-Demo are comparing it to the libraries listed below

Sorting:

TrustAIRLab / JailbreakLLMs
A dataset consists of 6,387 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 666 jailbreak prompts).
☆12Updated last year
theshi-1128 / jailbreak-bench
The most comprehensive and accurate LLM jailbreak attack benchmark by far
☆19Updated 3 months ago
aaFrostnova / Papillon
[Usenix Security 2025] Official repo of paper PAPILLON: Efficient and Stealthy Fuzz Testing-Powered Jailbreaks for LLMs
☆14Updated last month
tml-epfl / llm-adaptive-attacks
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]
☆320Updated 5 months ago
DanielIzquierdo / WT901BLECL
Process the data from this bluetooth module in python from PC
☆10Updated 6 years ago
vinusankars / BEAST
Implementation of BEAST adversarial attack for language models (ICML 2024)
☆88Updated last year
royweiss1 / GPT_Keylogger
This is the official repository for the code used in the paper: "What Was Your Prompt? A Remote Keylogging Attack on AI Assistants", USEN…
☆52Updated 5 months ago
microsoft / BIPIA
A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
☆72Updated last year
ssloxford / biometric-backdoors
Code for "Biometric Backdoors: A Poisoning Attack Against Unsupervised Template Updating"
☆11Updated 3 years ago
WUSTL-CSPL / LLMJailbreak
☆34Updated 9 months ago
Sensente / Security-Attacks-on-LCCTs
Security Attacks on LLM-based Code Completion Tools (AAAI 2025)
☆20Updated 2 months ago
qizhangli / Gradient-based-Jailbreak-Attacks
Code for our NeurIPS 2024 paper Improved Generation of Adversarial Examples Against Safety-aligned LLMs
☆11Updated 8 months ago
JailbreakBench / artifacts
Jailbreak artifacts for JailbreakBench
☆60Updated 8 months ago
ebagdasa / multimodal_injection
☆91Updated last year
liu00222 / Open-Prompt-Injection
This repository provides a benchmark for prompt Injection attacks and defenses
☆245Updated last month
xirui-li / DrAttack
Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
☆53Updated 10 months ago
facebookresearch / radioactive-watermark
Code for the paper "Watermarking Makes Language Models Radioactive"
☆17Updated 8 months ago
JailbreakBench / jailbreakbench
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
☆372Updated 3 months ago
XHMY / AutoDefense
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks
☆51Updated last month
RICommunity / TAP
TAP: An automated jailbreaking method for black-box LLMs
☆176Updated 7 months ago
OxMarco / Micro-GSM-Network
Basic 2G sms and voice calls with a LimeNET Micro v2.1 and the osmocom nitb stack
☆11Updated last year
ThuCCSLab / JailbreakEval
[NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts.
☆162Updated 3 months ago
Yuchen413 / text2image_safety
☆186Updated 3 months ago
ZiyueWang25 / llm-security-challenge
Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact and break out of shell environments using the Over…
☆13Updated last year
ThuCCSLab / FigStep
[AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts
☆156Updated 3 weeks ago
ANG13T / nRFi-Monitor
A 2.4GHz band and WiFi analyzer toolkit made with the D1 Mini and NRF24L01
☆38Updated 2 years ago
verazuo / prompt-stealing-attack
[USENIX'24] Prompt Stealing Attacks Against Text-to-Image Generation Models
☆39Updated 6 months ago
patrickrchao / JailbreakingLLMs
☆573Updated 2 weeks ago
yueliu1999 / Awesome-Jailbreak-on-LLMs
Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, exciting jailbreak methods on LLMs. It contains papers, codes, data…
☆792Updated last week
wfhstudio / RFID-Cloner-OLED
☆11Updated last year