wagner-group / MarkMyWords
☆29Updated 11 months ago
Alternatives and similar repositories for MarkMyWords
Users who are interested in MarkMyWords are comparing it to the repositories listed below
- Code for watermarking language models☆78Updated 8 months ago
- Official implementation of the ICLR 2022 paper "Adversarial Unlearning of Backdoors via Implicit Hypergradient"☆53Updated 2 years ago
- Repository for Towards Codable Watermarking for Large Language Models☆36Updated last year
- Code for the NeurIPS 2021 paper "Adversarial Neuron Pruning Purifies Backdoored Deep Models"☆57Updated 2 years ago
- ☆24Updated 2 months ago
- ☆53Updated last year
- Reconstructive Neuron Pruning for Backdoor Defense (ICML 2023)☆36Updated last year
- Camouflage poisoning via machine unlearning☆17Updated 2 years ago
- Backdoor Safety Tuning (NeurIPS 2023 & 2024 Spotlight)☆26Updated 5 months ago
- [ICML 2023] Are Diffusion Models Vulnerable to Membership Inference Attacks?☆34Updated 8 months ago
- The official implementation of USENIX Security'23 paper "Meta-Sift" -- Ten minutes or less to find a 1000-size or larger clean subset on …☆18Updated 2 years ago
- ☆25Updated 2 years ago
- [NeurIPS 2023] Differentially Private Image Classification by Learning Priors from Random Processes☆12Updated last year
- [CCS-LAMPS'24] LLM IP Protection Against Model Merging☆14Updated 7 months ago
- Code and data to go with the Zhu et al. paper "An Objective for Nuanced LLM Jailbreaks"☆29Updated 4 months ago
- ☆20Updated 5 months ago
- Official Code for "Baseline Defenses for Adversarial Attacks Against Aligned Language Models"☆23Updated last year
- RAB: Provable Robustness Against Backdoor Attacks☆39Updated last year
- [NeurIPS23 (Spotlight)] "Model Sparsity Can Simplify Machine Unlearning" by Jinghan Jia*, Jiancheng Liu*, Parikshit Ram, Yuguang Yao, Gao…☆67Updated last year
- [ICLR'21] Dataset Inference for Ownership Resolution in Machine Learning☆32Updated 2 years ago
- Official implementation of "RelaxLoss: Defending Membership Inference Attacks without Losing Utility" (ICLR 2022)☆49Updated 2 years ago
- Comprehensive Assessment of Trustworthiness in Multimodal Foundation Models☆19Updated last month
- Code for identifying natural backdoors in existing image datasets.☆15Updated 2 years ago
- Certified Removal from Machine Learning Models☆65Updated 3 years ago
- ☆25Updated 3 years ago
- Code release for "Unrolling SGD: Understanding Factors Influencing Machine Unlearning", published at EuroS&P'22☆22Updated 3 years ago
- ☆73Updated 2 years ago
- ☆19Updated 7 months ago
- ☆19Updated last year
- ☆24Updated 2 years ago