skywalker023 / confaide
🤫 Code and benchmark for our ICLR 2024 spotlight paper: "Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory"
☆42 · Updated last year
Alternatives and similar repositories for confaide
Users interested in confaide are comparing it to the repositories listed below.
- [ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models ☆81 · Updated 8 months ago
- ☆42 · Updated 3 months ago
- ☆38 · Updated last year
- ☆13 · Updated 2 years ago
- Restore safety in fine-tuned language models through task arithmetic ☆28 · Updated last year
- Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888 ☆35 · Updated 11 months ago
- ☆36 · Updated 7 months ago
- Official PyTorch implementation of "Query-Efficient Black-Box Red Teaming via Bayesian Optimization" (ACL'23) ☆14 · Updated last year
- ☆21 · Updated 2 months ago
- ☆35 · Updated last year
- [EMNLP 2023] Poisoning Retrieval Corpora by Injecting Adversarial Passages https://arxiv.org/abs/2310.19156 ☆32 · Updated last year
- [NeurIPS 2024 D&B] Evaluating Copyright Takedown Methods for Language Models ☆17 · Updated 9 months ago
- Official code implementation of SKU, accepted by ACL 2024 Findings ☆14 · Updated 4 months ago
- Official Repository for Dataset Inference for LLMs ☆33 · Updated 9 months ago
- ☆26 · Updated last year
- ☆54 · Updated 2 years ago
- RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models (NeurIPS 2024) ☆74 · Updated 7 months ago
- ☆21 · Updated last year
- [ICLR'24 Spotlight] DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer