MASTERKEY is a framework designed to explore and exploit vulnerabilities in large language model chatbots by automating jailbreak attacks and evaluating their defenses.
☆33Sep 12, 2024Updated last year
Alternatives and similar repositories for MasterKey
Users who are interested in MasterKey are comparing it to the repositories listed below
- Red Queen Dataset and data generation template☆26Dec 26, 2025Updated 2 months ago
- A novel jailbreak attack unveiling an overlooked attack surface inherently in the chain-of-thought reasoning trajectory of LLMs☆22Sep 18, 2025Updated 5 months ago
- [USENIX Security 2024] Official Repository of 'KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-…☆14Aug 6, 2025Updated 6 months ago
- TAP: An automated jailbreaking method for black-box LLMs☆221Dec 10, 2024Updated last year
- A lightweight library for large language model (LLM) jailbreaking defense.☆61Sep 11, 2025Updated 5 months ago
- ☆31Oct 10, 2023Updated 2 years ago
- Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM☆39Jan 17, 2025Updated last year
- ☆37Sep 30, 2024Updated last year
- Code for the paper "Robust Mid-Pass Filtering Graph Convolutional Networks" (accepted at WWW 2023)☆13Feb 17, 2023Updated 3 years ago
- ☆11Nov 8, 2023Updated 2 years ago
- ☆47Nov 17, 2022Updated 3 years ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning☆98May 23, 2024Updated last year
- PhishDecloaker: Detecting CAPTCHA-cloaked Phishing Websites via Hybrid Vision-based Interactive Models☆14Jan 3, 2025Updated last year
- ☆30Oct 21, 2025Updated 4 months ago
- Temporal-Dynamics Aware Adversarial Attacks on Discrete Time Dynamic Graph Models☆17Oct 19, 2024Updated last year
- The implementation code of the paper “A Practical Clean-Label Backdoor Attack with Limited Information in Vertical Federated Learning”☆11Jul 1, 2023Updated 2 years ago
- ☆11Jul 19, 2022Updated 3 years ago
- Code for the ICLR 2024 paper "How Realistic Is Your Synthetic Data? Constraining Deep Generative Models for Tabular Data"☆18Apr 15, 2025Updated 10 months ago
- ☆11Dec 19, 2023Updated 2 years ago
- [CVPR2025] Divide and Conquer: Heterogeneous Noise Integration for Diffusion-based Adversarial Purification☆15Nov 9, 2025Updated 3 months ago
- The A2C Reinforcement Learning Algorithm in Pytorch☆16May 13, 2024Updated last year
- CVE-2017-13156-Janus reproduction☆12Sep 7, 2020Updated 5 years ago
- An official PyTorch implementation of "Certifiably Robust Graph Contrastive Learning" (NeurIPS 2023)☆11Jan 22, 2024Updated 2 years ago
- Reverse Engineering Imperceptible Backdoor Attacks on Deep Neural Networks for Detection and Training Set Cleansing☆14Feb 18, 2021Updated 5 years ago
- ☆14Oct 6, 2024Updated last year
- ☆14Feb 24, 2026Updated last week
- [USENIX Security 2025] SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks☆19Sep 18, 2025Updated 5 months ago
- Official release of code for the paper "RL is a hammer and LLMs are nails: A simple RL approach to stronger prompt injection attacks"☆40Feb 11, 2026Updated 2 weeks ago
- ☆12May 27, 2022Updated 3 years ago
- ☆18May 19, 2025Updated 9 months ago
- ☆12Dec 1, 2024Updated last year
- ☆698Jul 2, 2025Updated 8 months ago
- ☆13Oct 30, 2024Updated last year
- [NDSS'25] The official implementation of safety misalignment.☆17Jan 8, 2025Updated last year
- [NIPS2025] A decentralized, RAG-enhanced multi-agent framework for LLMs with dynamic task routing and agent evolution.☆35Oct 2, 2025Updated 5 months ago
- GraphAlign: Pretraining One Graph Neural Network on Multiple Graphs via Feature Alignment☆18Sep 17, 2024Updated last year
- A Clone-Based Approach for Recommending Modification on Pasted Code☆12Jun 10, 2017Updated 8 years ago
- QT/C++ calculator☆14Feb 21, 2020Updated 6 years ago
- On Lipschitz Regularization of Convolutional Layers using Toeplitz Matrix Theory☆10Aug 19, 2021Updated 4 years ago