jeffhj / LM_PersonalInfoLeak
The code and data for "Are Large Pre-Trained Language Models Leaking Your Personal Information?" (Findings of EMNLP '22)
☆24 · Updated 2 years ago
Alternatives and similar repositories for LM_PersonalInfoLeak
Users who are interested in LM_PersonalInfoLeak are comparing it to the repositories listed below
- Training data extraction on GPT-2 ☆190 · Updated 2 years ago
- ☆13 · Updated 2 years ago
- Official Repository for Dataset Inference for LLMs ☆36 · Updated last year
- A re-implementation of the "Extracting Training Data from Large Language Models" paper by Carlini et al., 2020 ☆36 · Updated 3 years ago
- Code for the WWW'23 paper "Sanitizing Sentence Embeddings (and Labels) for Local Differential Privacy" ☆12 · Updated 2 years ago
- Implementation of the paper "Exploring the Universal Vulnerability of Prompt-based Learning Paradigm" on Findings of NAACL 2022 ☆30 · Updated 3 years ago
- ☆19 · Updated 3 years ago
- ☆55 · Updated 2 years ago
- ☆9 · Updated 4 years ago
- ☆36 · Updated 2 years ago
- ☆57 · Updated last year
- Source code of NAACL 2025 Findings "Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models" ☆12 · Updated 6 months ago
- ☆6 · Updated 2 years ago
- 🤫 Code and benchmark for our ICLR 2024 spotlight paper: "Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Con… ☆42 · Updated last year
- ☆44 · Updated 6 months ago
- A Synthetic Dataset for Personal Attribute Inference (NeurIPS'24 D&B) ☆43 · Updated last week
- The repository contains the code for analysing the leakage of personally identifiable (PII) information from the output of next word pred… ☆100 · Updated 11 months ago
- Code for paper: "Spinning Language Models: Risks of Propaganda-as-a-Service and Countermeasures" ☆22 · Updated 3 years ago
- Repo for arXiv preprint "Gradient-based Adversarial Attacks against Text Transformers" ☆107 · Updated 2 years ago
- ☆26 · Updated 4 years ago
- ☆44 · Updated 2 years ago
- ☆75 · Updated 3 years ago
- Python package for measuring memorization in LLMs. ☆161 · Updated 3 weeks ago
- This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition. ☆90 · Updated last year
- Code for the paper "Weight Poisoning Attacks on Pre-trained Models" (ACL 2020) ☆142 · Updated 3 years ago
- Frequency-Guided Word Substitutions for Detecting Textual Adversarial Examples (EACL 2021) ☆8 · Updated 4 years ago
- ☆293 · Updated this week
- [ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models ☆82 · Updated 10 months ago
- ☆41 · Updated 11 months ago
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆106 · Updated 5 months ago