ethz-spylab / misleading-privacy-evals
Official code for "Evaluations of Machine Learning Privacy Defenses are Misleading" (https://arxiv.org/abs/2404.17399)
☆10 · Updated last year
Alternatives and similar repositories for misleading-privacy-evals
Users interested in misleading-privacy-evals are comparing it to the libraries listed below.
- [ICLR'24 Spotlight] DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer ☆46 · Updated last year
- ☆167 · Updated last month
- Python package for measuring memorization in LLMs. ☆166 · Updated 2 months ago
- [ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models" ☆60 · Updated 11 months ago
- [NeurIPS23 (Spotlight)] "Model Sparsity Can Simplify Machine Unlearning" by Jinghan Jia*, Jiancheng Liu*, Parikshit Ram, Yuguang Yao, Gao… ☆80 · Updated last year
- Implementation of "DiLM: Distilling Dataset into Language Model for Text-level Dataset Distillation" (accepted by NAACL 2024 Findings). ☆23 · Updated 7 months ago
- Private Evolution: Generating DP Synthetic Data without Training [ICLR 2024, ICML 2024 Spotlight] ☆102 · Updated this week
- Official Repository for Dataset Inference for LLMs ☆41 · Updated last year
- This is an official repository for "Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources" (… ☆15 · Updated last year
- ☆23 · Updated 9 months ago
- ☆19 · Updated last year
- ☆27 · Updated last year
- Codebase for decoding compressed trust. ☆24 · Updated last year
- ☆44 · Updated last year
- ☆27 · Updated 6 months ago
- Code and dataset for the paper "Can Editing LLMs Inject Harm?" ☆21 · Updated 10 months ago
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" ☆70 · Updated 2 months ago
- ☆45 · Updated 7 months ago
- ☆57 · Updated 2 years ago
- A toolkit to assess data privacy in LLMs (under development) ☆62 · Updated 8 months ago
- [AAAI, ICLR TP] Fast Machine Unlearning Without Retraining Through Selective Synaptic Dampening ☆55 · Updated last year
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆112 · Updated 7 months ago
- Official repo for the paper "Recovering Private Text in Federated Learning of Language Models" (NeurIPS 2022) ☆59 · Updated 2 years ago
- A collection of papers about Private Evolution ☆17 · Updated 2 months ago
- ☆22 · Updated 2 years ago
- A survey on harmful fine-tuning attacks for large language models ☆208 · Updated this week
- [ICLR24 (Spotlight)] "SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation… ☆130 · Updated 4 months ago
- GitHub repo for the NeurIPS 2024 paper "Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models" ☆20 · Updated last week
- Code for the paper "Universal Jailbreak Backdoors from Poisoned Human Feedback" ☆59 · Updated last year
- ☆60 · Updated last year