[ICML 2024] Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts (Official PyTorch Implementation)
☆52 · Jan 11, 2026 · Updated last month
Alternatives and similar repositories for P4D
Users who are interested in P4D are comparing it to the repositories listed below:
- The official implementation of ECCV'24 paper "To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Uns… ☆87 · Feb 28, 2025 · Updated last year
- ☆13 · Jan 14, 2026 · Updated last month
- Divide-and-Conquer Attack: Harnessing the Power of LLM to Bypass the Censorship of Text-to-Image Generation Mode ☆18 · Feb 16, 2025 · Updated last year
- ☆38 · Jan 15, 2025 · Updated last year
- Official Implementation of implicit reference attack ☆11 · Oct 16, 2024 · Updated last year
- ☆43 · Jun 1, 2023 · Updated 2 years ago
- ☆23 · Feb 5, 2026 · Updated last month
- A collection of resources on attacks and defenses targeting text-to-image diffusion models ☆94 · Dec 20, 2025 · Updated 2 months ago
- ☆46 · Jul 14, 2024 · Updated last year
- Official Implementation of Safe Latent Diffusion for Text2Image ☆94 · Apr 21, 2023 · Updated 2 years ago
- ☆197 · Apr 7, 2025 · Updated 11 months ago
- List of T2I safety papers, updated daily, welcome to discuss using Discussions ☆67 · Aug 12, 2024 · Updated last year
- ☆19 · May 14, 2025 · Updated 9 months ago
- [CVPR2024] MMA-Diffusion: MultiModal Attack on Diffusion Models ☆386 · Jan 8, 2026 · Updated 2 months ago
- ☆28 · May 28, 2023 · Updated 2 years ago
- [ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking … ☆35 · Oct 23, 2024 · Updated last year
- PyTorch implementation for the pilot study on the robustness of latent diffusion models. ☆13 · Jun 20, 2023 · Updated 2 years ago
- Unified Concept Editing in Diffusion Models ☆184 · Dec 7, 2025 · Updated 3 months ago
- Accepted by ECCV 2024 ☆192 · Oct 15, 2024 · Updated last year
- Official repository for the paper "Gradient-based Jailbreak Images for Multimodal Fusion Models" (https://arxiv.org/abs/2410.03489) ☆19 · Oct 22, 2024 · Updated last year
- Official repository for "On the Multi-modal Vulnerability of Diffusion Models" ☆16 · Jul 15, 2024 · Updated last year
- ☆14 · Mar 1, 2019 · Updated 7 years ago
- A paper summary of Backdoor Attack against Neural Network ☆13 · Aug 9, 2019 · Updated 6 years ago
- Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts" ☆42 · Jul 8, 2024 · Updated last year
- Official Code for "Intelligent Painter: Picture Composition With Resampling Diffusion Model" (ICIP 2023) ☆17 · Jun 23, 2023 · Updated 2 years ago
- [NeurIPS-2023] Annual Conference on Neural Information Processing Systems ☆228 · Dec 22, 2024 · Updated last year
- A repository of resources on machine unlearning for diffusion models ☆58 · Oct 9, 2025 · Updated 4 months ago
- The official code of IEEE S&P 2024 paper "Why Does Little Robustness Help? A Further Step Towards Understanding Adversarial Transferabili… ☆20 · Aug 22, 2024 · Updated last year
- This dataset contains results from all rounds of Adversarial Nibbler. This data includes adversarial prompts fed into public generative t… ☆25 · Feb 3, 2025 · Updated last year
- ☆22 · Sep 13, 2021 · Updated 4 years ago
- ☆24 · Jun 17, 2025 · Updated 8 months ago
- The official implementation of USENIX Security'23 paper "Meta-Sift" -- Ten minutes or less to find a 1000-size or larger clean subset on … ☆20 · Apr 27, 2023 · Updated 2 years ago
- Official codebase for Image Hijacks: Adversarial Images can Control Generative Models at Runtime ☆54 · Sep 19, 2023 · Updated 2 years ago
- Adversarial Robustness, White-box, Adversarial Attack ☆52 · Jul 6, 2022 · Updated 3 years ago
- [ICLR 2025] Dissecting adversarial robustness of multimodal language model agents ☆130 · Feb 19, 2025 · Updated last year
- ☆23 · Apr 10, 2023 · Updated 2 years ago
- Fluent student-teacher redteaming ☆23 · Jul 25, 2024 · Updated last year
- [CCS'24] SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models ☆138 · Jul 1, 2025 · Updated 8 months ago
- Erasing Concepts from Diffusion Models ☆656 · Aug 18, 2025 · Updated 6 months ago