☆21Jul 26, 2025Updated 7 months ago
Alternatives and similar repositories for SQL-Injection-Jailbreak
Users who are interested in SQL-Injection-Jailbreak are comparing it to the repositories listed below
- ☆19May 14, 2025Updated 10 months ago
- ☆13Feb 21, 2025Updated last year
- ☆38Nov 16, 2025Updated 4 months ago
- [NeurIPS 2025] The official implementation of "T2SMark: Balancing Robustness and Diversity in Noise-as-Watermark for Diffusion Models"☆45Nov 2, 2025Updated 4 months ago
- [NeurIPS 2025] StegoZip: Enhancing Linguistic Steganography Payload in Practice with Large Language Models☆29Dec 4, 2025Updated 3 months ago
- The official repository for guided jailbreak benchmark☆29Jul 28, 2025Updated 7 months ago
- ☆52Feb 24, 2024Updated 2 years ago
- ☆26Sep 3, 2025Updated 6 months ago
- Welcome to the official repository for Siren, a project aimed at understanding and mitigating harmful behaviors in large language models …☆15Sep 12, 2025Updated 6 months ago
- [ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion☆59Oct 1, 2025Updated 5 months ago
- STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation☆75Nov 11, 2025Updated 4 months ago
- Red Queen Dataset and data generation template☆27Dec 26, 2025Updated 2 months ago
- [ICLR 2025] A Closer Look at Machine Unlearning for Large Language Models☆46Dec 4, 2024Updated last year
- ☆44Oct 19, 2025Updated 5 months ago
- ☆34Dec 2, 2023Updated 2 years ago
- [ICML 2025] Official source code for the paper "FlipAttack: Jailbreak LLMs via Flipping".☆168May 2, 2025Updated 10 months ago
- ☆11May 18, 2025Updated 10 months ago
- Code repo of our paper Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis (https://arxiv.org/abs/2406.10794…☆23Jul 26, 2024Updated last year
- ☆124Feb 3, 2025Updated last year
- ☆40May 17, 2025Updated 10 months ago
- Provably Secure Steganography in Practice Based on “Distribution Copies”☆42Jun 1, 2025Updated 9 months ago
- Source code of NAACL 2025 Findings "Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models"☆15Dec 16, 2025Updated 3 months ago
- Official implementation of "Data Mixture Inference: What do BPE tokenizers reveal about their training data?"☆18May 15, 2025Updated 10 months ago
- This repository includes the main notebook of the code for our proposed RCGAN☆12Apr 10, 2020Updated 5 years ago
- [ICML 2025] Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions☆14Mar 7, 2026Updated last week
- Code Implementation of Adversarial Prompt Evaluation paper☆14Sep 18, 2025Updated 6 months ago
- [ACL 2025] The official implementation of the paper "PIGuard: Prompt Injection Guardrail via Mitigating Overdefense for Free".☆63Dec 4, 2025Updated 3 months ago
- Adversarial Attack for Pre-trained Code Models☆10Jul 19, 2022Updated 3 years ago
- [COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and fur…☆88May 9, 2025Updated 10 months ago
- An audio steganalysis method based on CNN in the time domain.☆12Feb 25, 2021Updated 5 years ago
- ☆13Jun 10, 2018Updated 7 years ago
- Provably Secure Steganography☆14Sep 13, 2025Updated 6 months ago
- [COLING 2025] Official repo of paper: "Not Aligned" is Not "Malicious": Being Careful about Hallucinations of Large Language Models' Jail…☆12Jul 26, 2024Updated last year
- [ACL 2025] The official code for "AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection".☆37Aug 4, 2025Updated 7 months ago
- ☆21May 23, 2025Updated 9 months ago
- Official repository for the paper "Gradient-based Jailbreak Images for Multimodal Fusion Models" (https://arxiv.org/abs/2410.03489)☆19Oct 22, 2024Updated last year
- [NDSS'25] The official implementation of safety misalignment.☆17Jan 8, 2025Updated last year
- ☆25Apr 15, 2024Updated last year
- Effective Prompt Extraction from Language Models☆34Sep 10, 2024Updated last year