sail-sg / closer-look-LLM-unlearning
The official code of the paper "A Closer Look at Machine Unlearning for Large Language Models".
☆20 · Updated last month
Alternatives and similar repositories for closer-look-LLM-unlearning:
Users interested in closer-look-LLM-unlearning are comparing it to the libraries listed below.
- Official code for the NeurIPS 2024 paper "Vaccine: Perturbation-aware Alignment for Large Language Models" · ☆32 · Updated 2 months ago
- Official repo for the EMNLP 2024 paper "SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning" · ☆17 · Updated 3 months ago
- Official repo for the NeurIPS 2024 paper "WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models" · ☆11 · Updated last month
- Official code for the paper "Evaluating Copyright Takedown Methods for Language Models" · ☆16 · Updated 6 months ago
- "Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning" by Chongyu Fan*, Jiancheng Liu*, Licong Lin*, Jingh… · ☆21 · Updated this week
- Official repository for the paper "Safety Alignment Should Be Made More Than Just a Few Tokens Deep" · ☆63 · Updated 6 months ago
- "RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models" (NeurIPS 2024) · ☆65 · Updated 3 months ago
- "Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses" (NeurIPS 2024) · ☆53 · Updated this week
- Official repository for the ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models" · ☆83 · Updated 4 months ago
- "SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors" · ☆36 · Updated 6 months ago
- Code and data for the ACL 2024 paper "Machine Unlearning of Pre-trained Large Language Models" · ☆53 · Updated 3 months ago