chrisliu298 / awesome-llm-unlearning
A resource repository for machine unlearning in large language models
☆523 · Updated 2 weeks ago
Alternatives and similar repositories for awesome-llm-unlearning
Users interested in awesome-llm-unlearning are comparing it to the repositories listed below.
- [NeurIPS D&B '25] The one-stop repository for LLM unlearning ☆466 · Updated 3 weeks ago
- A survey of harmful fine-tuning attacks on large language models ☆230 · Updated last week
- LLM Unlearning ☆180 · Updated 2 years ago
- Up-to-date LLM watermarking papers. 🔥🔥🔥 ☆369 · Updated last year
- ☆182 · Updated 2 months ago
- Python package for measuring memorization in LLMs. ☆179 · Updated 6 months ago
- Toolkit for evaluating the trustworthiness of generative foundation models. ☆125 · Updated 5 months ago
- Official repository for the paper "Safety Alignment Should Be Made More Than Just a Few Tokens Deep" ☆169 · Updated 8 months ago
- A resource repository for representation engineering in large language models ☆146 · Updated last year
- ☆28 · Updated last month
- The latest papers on detection of LLM-generated text and code ☆282 · Updated 7 months ago
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. ACM Computing Surveys, 2025. ☆644 · Updated last week
- We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20… ☆335 · Updated last year
- A curated list of LLM interpretability-related material: tutorials, libraries, surveys, papers, blogs, etc. ☆290 · Updated last month
- Accepted by ECCV 2024 ☆181 · Updated last year
- Accepted by IJCAI-24 Survey Track ☆227 · Updated last year
- A toolkit to assess data privacy in LLMs (under development) ☆67 · Updated last year
- This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector" ☆47 · Updated 3 months ago
- ☆167 · Updated 2 months ago
- [ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆89 · Updated 9 months ago
- Papers and resources related to the security and privacy of LLMs 🤖 ☆558 · Updated 7 months ago
- ☆71 · Updated last year
- Official implementation of AdvPrompter (https://arxiv.org/abs/2404.16873) ☆174 · Updated last year
- awesome SAE papers ☆71 · Updated 7 months ago
- This is the official code for the paper "Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable". ☆27 · Updated 10 months ago
- A curated list of resources for activation engineering ☆122 · Updated 3 months ago
- The code for the paper "The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)", exploring the privacy risk o… ☆64 · Updated 11 months ago
- ☆55 · Updated last year
- A survey of privacy problems in large language models (LLMs). Contains summaries of the corresponding papers along with relevant code ☆68 · Updated last year
- 😎 Up-to-date, curated list of papers, methods, and resources on attacks against large vision-language models ☆467 · Updated last week