snwen123 / LLM_Unlearning_Papers
☆22 · Updated 9 months ago
Related projects:
- ☆32 · Updated 11 months ago
- Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations" · ☆56 · Updated 6 months ago
- RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models · ☆50 · Updated 2 months ago
- Code for the ICLR'22 paper "On Robust Prefix-Tuning for Text Classification" · ☆26 · Updated 2 years ago
- Official code for the ICML 2024 paper on Persona In-Context Learning (PICLe) · ☆20 · Updated 2 months ago
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model · ☆59 · Updated last year
- [ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models" · ☆34 · Updated 4 months ago
- ☆32 · Updated 10 months ago
- ☆21 · Updated last year
- ☆14 · Updated 2 months ago
- ☆26 · Updated last year
- [ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models · ☆67 · Updated last week
- Dataset and code for Multimodal Fact Checking and Explanation Generation (Mocheg) · ☆35 · Updated 9 months ago
- Official code implementation of SKU, accepted to ACL 2024 Findings · ☆11 · Updated 4 months ago
- Official repository for the ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models" · ☆64 · Updated 2 weeks ago
- [NeurIPS 2023] GitHub repository for "Composing Parameter-Efficient Modules with Arithmetic Operations" · ☆54 · Updated 9 months ago
- Methods and evaluation for aligning language models temporally · ☆24 · Updated 6 months ago
- ☆23 · Updated last year
- ☆23 · Updated last year
- Unofficial re-implementation of "Trusting Your Evidence: Hallucinate Less with Context-aware Decoding" · ☆25 · Updated 10 months ago
- Restore safety in fine-tuned language models through task arithmetic · ☆25 · Updated 5 months ago
- ☆21 · Updated 6 months ago
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions · ☆96 · Updated last week
- Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models · ☆22 · Updated 11 months ago
- ☆42 · Updated 5 months ago
- ☆21 · Updated 2 months ago
- ☆24 · Updated 4 months ago
- Evaluating the Ripple Effects of Knowledge Editing in Language Models · ☆45 · Updated 5 months ago
- Dataset and code for the ICLR 2024 paper "Can LLM-Generated Misinformation Be Detected?" · ☆45 · Updated last month
- Multilingual safety benchmark for Large Language Models · ☆21 · Updated 2 weeks ago