CharlesYu2000 / PCGU-UnlearningBias
☆15 Updated last year
Alternatives and similar repositories for PCGU-UnlearningBias:
Users who are interested in PCGU-UnlearningBias are comparing it to the repositories listed below
- [ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models ☆79 Updated 4 months ago
- ☆36 Updated last year
- ☆20 Updated 6 months ago
- ☆21 Updated 3 months ago
- ☆44 Updated last year
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆61 Updated 2 months ago
- ☆24 Updated 3 months ago
- Unofficial re-implementation of "Trusting Your Evidence: Hallucinate Less with Context-aware Decoding" ☆28 Updated last month
- ☆118 Updated last year
- ☆24 Updated last year
- Official code implementation of SKU, Accepted by ACL 2024 Findings ☆13 Updated last month
- ☆45 Updated 6 months ago
- ☆16 Updated last year
- Official code for the paper: Evaluating Copyright Takedown Methods for Language Models ☆16 Updated 6 months ago
- [ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models" ☆53 Updated 3 months ago
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆57 Updated 3 months ago
- ☆29 Updated 8 months ago
- EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data" https://arxiv.org/abs/2210.14975 ☆37 Updated last year
- Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge ☆12 Updated 10 months ago
- Landing Page for TOFU ☆107 Updated 3 weeks ago
- Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep ☆63 Updated 6 months ago
- [NeurIPS 2023 D&B Track] Code and data for paper "Revisiting Out-of-distribution Robustness in NLP: Benchmarks, Analysis, and LLMs Evalua… ☆31 Updated last year
- ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models. ☆128 Updated last month
- Code for ACL 2023 paper "BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases". ☆21 Updated last year
- Official repository for ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models" ☆83 Updated 4 months ago
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model ☆66 Updated 2 years ago
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆57 Updated last year
- [EMNLP 2023] Poisoning Retrieval Corpora by Injecting Adversarial Passages https://arxiv.org/abs/2310.19156 ☆28 Updated last year
- ☆39 Updated last year
- A resource repository for representation engineering in large language models ☆93 Updated 2 months ago