mireshghallah / ft-memorization

☆13

Alternatives and similar repositories for ft-memorization

Users who are interested in ft-memorization are comparing it to the repositories listed below.

amazon-science / controlling-llm-memorization
☆38 Updated 2 years ago
pratyushmaini / llm_dataset_inference
Official Repository for Dataset Inference for LLMs
☆41 Updated last year
Princeton-SysML / kNNLM_privacy
Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888
☆36 Updated last year
xiangyue9607 / Sentence-LDP
Code for the WWW'23 paper "Sanitizing Sentence Embeddings (and Labels) for Local Differential Privacy"
☆12 Updated 2 years ago
thu-coai / Targeted-Data-Extraction
Official Code for ACL 2023 paper: "Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confid…
☆23 Updated 2 years ago
Vaidehi99 / InfoDeletionAttacks
☆46 Updated 8 months ago
skywalker023 / confaide
🤫 Code and benchmark for our ICLR 2024 spotlight paper: "Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Con…
☆46 Updated last year
leix28 / prompt-universal-vulnerability
Implementation of the paper "Exploring the Universal Vulnerability of Prompt-based Learning Paradigm" in Findings of NAACL 2022
☆30 Updated 3 years ago
parameterlab / mia-scaling
Source code of NAACL 2025 Findings "Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models"
☆14 Updated 8 months ago
AlexWan0 / Poisoning-Instruction-Tuned-Models
☆56 Updated last year
snw2021 / LLM_Unlearning_Papers
☆26 Updated last year
mireshghallah / neighborhood-curvature-mia
☆23 Updated 2 years ago
declare-lab / resta
Restore safety in fine-tuned language models through task arithmetic
☆29 Updated last year
SALT-NLP / Efficient_Unlearning
☆38 Updated 2 years ago
yihuaihong / ConceptVectors
[EMNLP 2025 Main] ConceptVectors Benchmark and Code for the paper "Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces"
☆35 Updated 2 months ago
joeljang / knowledge-unlearning
[ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models
☆83 Updated last year
weichen-yu / LM-Extraction
☆43 Updated 2 years ago
jeffhj / LM_PersonalInfoLeak
The code and data for "Are Large Pre-Trained Language Models Leaking Your Personal Information?" (Findings of EMNLP '22)
☆24 Updated 2 years ago
wyshi / lm_privacy
☆21 Updated 4 years ago
ejones313 / auditing-llms
☆58 Updated 2 years ago
centerforaisafety / tdc2023-starter-kit
This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition.
☆89 Updated last year
RockyLzy / TextDefender
Code for "Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution"
☆31 Updated 2 years ago
zjysteven / mink-plus-plus
[ICLR'25 Spotlight] Min-K%++: Improved baseline for detecting pre-training data of LLMs
☆45 Updated 5 months ago
VITA-Group / DP-OPT
[ICLR'24 Spotlight] DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer
☆46 Updated last year
yaojin17 / Unlearning_LLM
[ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models"
☆60 Updated last year
Thartvigsen / GRACE
[NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors
☆81 Updated 10 months ago
jthickstun / watermark
Code for watermarking language models
☆82 Updated last year
princeton-nlp / corpus-poisoning
[EMNLP 2023] Poisoning Retrieval Corpora by Injecting Adversarial Passages https://arxiv.org/abs/2310.19156
☆40 Updated last year
ajyl / dpo_toxic
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.
☆83 Updated 7 months ago
tatsu-lab / linguistic_calibration
Align your LM to express calibrated verbal statements of confidence in its long-form generations.
☆27 Updated last year