Official implementation of the NeurIPS'24 paper "Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models". This work adversarially unlearns the text encoder to make unlearned diffusion models (DMs) more robust against adversarial prompt attacks and achieves a better balance between unlearning performance and image generat…
☆49 · Nov 4, 2024 · Updated last year
Alternatives and similar repositories for AdvUnlearn
Users who are interested in AdvUnlearn are comparing it to the repositories listed below
- [NeurIPS 2024 D&B Track] UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models by Yihua Zhang, Cho… ☆82 · Nov 11, 2024 · Updated last year
- The official implementation of ECCV'24 paper "To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Uns… ☆87 · Feb 28, 2025 · Updated last year
- [ECCV 2024] "Receler: Reliable Concept Erasing of Text-to-Image Diffusion Models via Lightweight Erasers" (Official Implementation) ☆44 · Mar 2, 2025 · Updated last year
- Code for the paper - ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning ☆22 · Aug 13, 2024 · Updated last year
- [ECCV 2024] Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models ☆84 · Oct 29, 2024 · Updated last year
- NeurIPS 2024 - Erasing Undesirable Concepts in Diffusion Models with Adversarial Preservation ☆17 · Dec 5, 2024 · Updated last year
- [ICLR24 (Spotlight)] "SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation… ☆142 · May 27, 2025 · Updated 9 months ago
- Official repo for EMNLP'24 paper "SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning" ☆29 · Oct 1, 2024 · Updated last year
- [ICML 2024] Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts (Official Pytorch Implementati… ☆52 · Jan 11, 2026 · Updated last month
- Unified Concept Editing in Diffusion Models ☆184 · Dec 7, 2025 · Updated 2 months ago
- Official repo for NeurIPS'24 paper "WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models" ☆18 · Dec 16, 2024 · Updated last year
- A collection of resources on attacks and defenses targeting text-to-image diffusion models ☆92 · Dec 20, 2025 · Updated 2 months ago
- [MM '24] EvilEdit: Backdooring Text-to-Image Diffusion Models in One Second ☆28 · Nov 19, 2024 · Updated last year
- [NeurIPS25] Official repo for "Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning" ☆42 · Oct 3, 2025 · Updated 4 months ago
- Towards Memorization-Free Diffusion Models (CVPR2024) Codebase ☆12 · Jun 2, 2024 · Updated last year
- EraseDiff: Erasing Data Influence in Diffusion Models ☆14 · Nov 20, 2024 · Updated last year
- [CVPR 2025] Six-CD: Benchmarking Concept Removals for Benign Text-to-image Diffusion Models ☆16 · Jan 8, 2026 · Updated last month
- Responsible Visual Editing ☆15 · Jul 10, 2024 · Updated last year
- [NeurIPS23 (Spotlight)] "Model Sparsity Can Simplify Machine Unlearning" by Jinghan Jia*, Jiancheng Liu*, Parikshit Ram, Yuguang Yao, Gao… ☆84 · Updated this week
- (NeurIPS 2024) Text-Guided Attention is All You Need for Zero-Shot Robustness in Vision-Language Models ☆15 · Jul 18, 2025 · Updated 7 months ago
- [NeurIPS 2024] Source code for our paper "Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models". ☆13 · Jul 18, 2025 · Updated 7 months ago
- A toolkit for optimizing machine learning models for practical applications ☆31 · Mar 6, 2025 · Updated 11 months ago
- Erasing Concepts from Diffusion Models ☆656 · Aug 18, 2025 · Updated 6 months ago
- Official PyTorch Implementation ☆17 · Dec 3, 2022 · Updated 3 years ago
- Official Code for ART: Automatic Red-teaming for Text-to-Image Models to Protect Benign Users (NeurIPS 2024) ☆23 · Oct 23, 2024 · Updated last year
- Code of paper [CVPR'24: Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion?] ☆23 · Apr 2, 2024 · Updated last year
- [ICLR 2025] SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image and Video Generation ☆53 · Jan 22, 2025 · Updated last year
- An innovative method designed to augment the capabilities of existing video diffusion models ☆22 · May 10, 2024 · Updated last year
- Official repo for An Efficient Membership Inference Attack for the Diffusion Model by Proximal Initialization ☆16 · Mar 8, 2024 · Updated last year
- [NeurIPS 2022] "Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets" by Ruisi Cai*, Zhenyu Zh… ☆21 · Oct 1, 2022 · Updated 3 years ago
- Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation ☆23 · Jul 30, 2025 · Updated 7 months ago
- Generalized Data-free Universal Adversarial Perturbations in PyTorch ☆20 · Oct 9, 2020 · Updated 5 years ago