Official Code for ACL 2023 paper: "Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation"
☆23 May 8, 2023 Updated 2 years ago
Alternatives and similar repositories for Targeted-Data-Extraction
Users that are interested in Targeted-Data-Extraction are comparing it to the libraries listed below
- ☆39 May 19, 2023 Updated 2 years ago
- This project explores training data extraction attacks on the LLaMa 7B, GPT-2XL, and GPT-2-IMDB models to discover memorized content usin… ☆15 Jun 15, 2023 Updated 2 years ago
- The repository contains the code for analysing the leakage of personally identifiable information (PII) from the output of next word pred… ☆104 Aug 13, 2024 Updated last year
- Training data extraction on GPT-2 ☆197 Feb 4, 2023 Updated 3 years ago
- ☆13 Oct 20, 2022 Updated 3 years ago
- Code for our NeurIPS 2023 paper Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly ☆14 Jan 22, 2024 Updated 2 years ago
- Python package for measuring memorization in LLMs. ☆184 Jul 16, 2025 Updated 8 months ago
- VAE+GAN ☆10 Apr 18, 2018 Updated 7 years ago
- ☆14 May 8, 2024 Updated last year
- Repo for EmbedLLM: Learning Compact Representations of Large Language Models ☆29 Sep 25, 2025 Updated 5 months ago
- Official PyTorch implementation of "Query-Efficient Black-Box Red Teaming via Bayesian Optimization" (ACL'23) ☆15 Jul 9, 2023 Updated 2 years ago
- ☆12 Jul 18, 2023 Updated 2 years ago
- ☆35 Oct 23, 2025 Updated 4 months ago
- ☆15 Feb 21, 2024 Updated 2 years ago
- 🤫 Code and benchmark for our ICLR 2024 spotlight paper: "Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Con… ☆50 Dec 20, 2023 Updated 2 years ago
- ☆72 Feb 16, 2025 Updated last year
- This repository collects a list of papers on conversational recommender systems and related areas. ☆15 Sep 19, 2023 Updated 2 years ago
- for DTCA model ☆10 Oct 17, 2023 Updated 2 years ago
- Tsinghua University 2019 Computer Networks joint experiment, Group 1 ☆28 Jan 15, 2020 Updated 6 years ago
- ☆29 Aug 31, 2025 Updated 6 months ago
- Repo for arXiv preprint "Gradient-based Adversarial Attacks against Text Transformers" ☆110 Dec 28, 2022 Updated 3 years ago
- Official repository for "Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space" ☆18 Jan 27, 2026 Updated last month
- ☆21 Jan 15, 2026 Updated 2 months ago
- Official implementation for "Instruction Tuning with Retrieval-based Examples Ranking for Aspect-based Sentiment Analysis" ☆13 May 31, 2024 Updated last year
- Official implementation for KDD25 paper "GraphLoRA: Structure-Aware Contrastive Low-Rank Adaptation for Cross-Graph Transfer Learning" ☆21 Jul 10, 2025 Updated 8 months ago
- [ACL 2025] LongSafety: Evaluating Long-Context Safety of Large Language Models ☆16 Jun 18, 2025 Updated 9 months ago
- [ICLR 2026] BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs ☆17 May 21, 2025 Updated 10 months ago
- ☆42 May 23, 2023 Updated 2 years ago
- The code for "MoPE: Mixture of Prefix Experts for Zero-Shot Dialogue State Tracking" ☆19 Jan 25, 2025 Updated last year
- [COLM2025] "Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors" ☆55 Oct 6, 2025 Updated 5 months ago
- Code for the NeurIPS 2024 submission: "DAGER: Extracting Text from Gradients with Language Model Priors" ☆20 Aug 13, 2025 Updated 7 months ago
- Web version of the MiniDecaf compiler. ☆13 Sep 17, 2020 Updated 5 years ago
- Code for Improving Task-free Continual Learning by Distributionally Robust Memory Evolution (ICML 2022) ☆11 Aug 20, 2022 Updated 3 years ago
- Data and code for EMNLP 2023 industry-track paper "Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-W… ☆30 Jan 5, 2024 Updated 2 years ago
- Federated Learning - PyTorch ☆15 Jun 27, 2021 Updated 4 years ago
- ☆18 Sep 5, 2024 Updated last year
- ☆11 Oct 5, 2024 Updated last year
- ☆12 Sep 30, 2022 Updated 3 years ago
- Code for our NeurIPS 2024 paper Improved Generation of Adversarial Examples Against Safety-aligned LLMs ☆12 Nov 7, 2024 Updated last year