inspire-group / DP-RandP
[NeurIPS 2023] Differentially Private Image Classification by Learning Priors from Random Processes
☆12 Updated last year
Alternatives and similar repositories for DP-RandP:
Users who are interested in DP-RandP are comparing it to the repositories listed below.
- ☆11 Updated 2 years ago
- The official implementation of USENIX Security'23 paper "Meta-Sift" -- Ten minutes or less to find a 1000-size or larger clean subset on … ☆18 Updated last year
- Backdoor Safety Tuning (NeurIPS 2023 & 2024 Spotlight) ☆25 Updated 2 months ago
- Code for the arXiv paper "When Do Universal Image Jailbreaks Transfer Between Vision-Language Models?" ☆18 Updated last month
- GitHub repo for One-shot Neural Backdoor Erasing via Adversarial Weight Masking (NeurIPS 2022) ☆14 Updated 2 years ago
- Code for NeurIPS 2021 paper "Adversarial Neuron Pruning Purifies Backdoored Deep Models" ☆56 Updated last year
- ☆53 Updated last year
- [ICML 2023] Are Diffusion Models Vulnerable to Membership Inference Attacks? ☆32 Updated 4 months ago
- Reconstructive Neuron Pruning for Backdoor Defense (ICML 2023) ☆34 Updated last year
- Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback" ☆45 Updated 8 months ago
- Identification of the Adversary from a Single Adversarial Example (ICML 2023) ☆9 Updated 6 months ago
- Official Code for "Baseline Defenses for Adversarial Attacks Against Aligned Language Models" ☆19 Updated last year
- ☆16 Updated 8 months ago
- Code and data to go with the Zhu et al. paper "An Objective for Nuanced LLM Jailbreaks" ☆21 Updated 3 weeks ago
- ☆25 Updated 7 months ago
- [CCS-LAMPS'24] LLM IP Protection Against Model Merging ☆11 Updated 3 months ago
- [ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion ☆32 Updated 2 months ago
- [ICLR 2023] Distilling Cognitive Backdoor Patterns within an Image ☆32 Updated 2 months ago
- ☆20 Updated 6 months ago
- This is an official repository for Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study (ICCV2023… ☆20 Updated last year
- ☆20 Updated 11 months ago
- Repository for the Paper: Refusing Safe Prompts for Multi-modal Large Language Models ☆12 Updated 3 months ago
- ☆28 Updated 7 months ago
- ☆21 Updated 3 months ago
- SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors ☆36 Updated 6 months ago
- Distribution Preserving Backdoor Attack in Self-supervised Learning ☆14 Updated 11 months ago
- ☆12 Updated 4 months ago
- Official repo for EMNLP'24 paper "SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning" ☆17 Updated 3 months ago
- Code for NeurIPS 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models" ☆36 Updated this week
- Camouflage poisoning via machine unlearning ☆16 Updated 2 years ago