licong-lin/negative-preference-optimization

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/licong-lin/negative-preference-optimization)

licong-lin / negative-preference-optimization

☆76

Alternatives and similar repositories for negative-preference-optimization

Users that are interested in negative-preference-optimization are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

jaechan-repo / muse_bench
View on GitHub
☆33Aug 9, 2024Updated last year
zhliu0106 / learning-to-refuse
View on GitHub
Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"
☆10Dec 13, 2024Updated last year
locuslab / open-unlearning
View on GitHub
[NeurIPS D&B '25] The one-stop repository for LLM unlearning
☆568Mar 18, 2026Updated 4 months ago
Lslland / T-Vaccine
View on GitHub
☆19Jun 21, 2025Updated last year
OPTML-Group / SOUL
View on GitHub
Official repo for EMNLP'24 paper "SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning"
☆30Oct 1, 2024Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
ethz-spylab / unlearning-vs-safety
View on GitHub
☆27Oct 6, 2024Updated last year
chrisliu298 / awesome-llm-unlearning
View on GitHub
A resource repository for machine unlearning in large language models
☆614Jul 14, 2026Updated last week
kevinyaobytedance / llm_unlearn
View on GitHub
LLM Unlearning
☆185Oct 20, 2023Updated 2 years ago
OPTML-Group / Unlearn-Smooth
View on GitHub
[ICML25] Official repo for "Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond…
☆24Sep 27, 2025Updated 9 months ago
OPTML-Group / WAGLE
View on GitHub
Official repo for NeurIPS'24 paper "WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models"
☆19Dec 16, 2024Updated last year
boyiwei / CoTaEval
View on GitHub
[NeurIPS 2024 D&B] Evaluating Copyright Takedown Methods for Language Models
☆17Jul 17, 2024Updated 2 years ago
franciscoliu / Awesome-GenAI-Unlearning
View on GitHub
☆188Apr 22, 2026Updated 2 months ago
jinzhuoran / RWKU
View on GitHub
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024
☆100Sep 30, 2024Updated last year
git-disl / Vaccine
View on GitHub
This is the official code for the paper "Vaccine: Perturbation-aware Alignment for Large Language Models" (NeurIPS2024)
☆51Jan 15, 2026Updated 6 months ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
yaojin17 / Unlearning_LLM
View on GitHub
[ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models"
☆68Sep 30, 2024Updated last year
EnnengYang / Efficient-WEMoE
View on GitHub
Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging. Arxiv, 2024.
☆16Oct 28, 2024Updated last year
HelloEveryboby / Butler
View on GitHub
Butler 是一个用于自动化服务管理和任务调度的工具项目。
☆17Updated this week
sail-sg / closer-look-LLM-unlearning
View on GitHub
[ICLR 2025] A Closer Look at Machine Unlearning for Large Language Models
☆49Dec 4, 2024Updated last year
centerforaisafety / wmdp
View on GitHub
WMDP is a LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning m…
☆176May 29, 2025Updated last year
UCSB-NLP-Chang / ULD
View on GitHub
Implementation of paper 'Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference' [NeurIPS'24…
☆26Jun 14, 2024Updated 2 years ago
arobey1 / advbench
View on GitHub
☆45Mar 3, 2023Updated 3 years ago
swj0419 / muse_bench
View on GitHub
☆34Mar 13, 2025Updated last year
princeton-nlp / benign-data-breaks-safety
View on GitHub
☆47Oct 1, 2024Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
houseme / sensitive-rs
View on GitHub
Sensitive-rs is a Rust library for finding, validating, filtering, and replacing sensitive words. It provides efficient algorithms to han…
☆26Updated this week
NJUPT-SAST / aurora-ui
View on GitHub
🌏 UI component library for the future, based on WebComponent.
☆23Nov 12, 2024Updated last year
KID-22 / LLM-Unlearning-Paper-List
View on GitHub
☆28Dec 18, 2025Updated 7 months ago
UCSC-REAL / FLAT
View on GitHub
[ICLR 2025] FLAT: LLM Unlearning via Loss Adjustment with Only Forget Data
☆14Feb 26, 2025Updated last year
franciscoliu / SKU
View on GitHub
Official code implementation of SKU, Accepted by ACL 2024 Findings
☆20Dec 18, 2024Updated last year
OPTML-Group / Unlearn-Saliency
View on GitHub
[ICLR24 (Spotlight)] "SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation…
☆149Feb 28, 2026Updated 4 months ago
AmourWaltz / UAlign
View on GitHub
Project of ACL 2025 "UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models"
☆15Mar 25, 2025Updated last year
MartinPawelczyk / In-Context-Unlearning
View on GitHub
"In-Context Unlearning: Language Models as Few Shot Unlearners". Martin Pawelczyk, Seth Neel* and Himabindu Lakkaraju*; ICML 2024.
☆30Oct 18, 2023Updated 2 years ago
aengusl / latent-adversarial-training
View on GitHub
☆48Sep 29, 2024Updated last year
End-to-end encrypted email - Proton Mail • Ad
Special offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
QwenLM / online_merging_optimizers
View on GitHub
Implementations of online merging optimizers proposed by Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
☆82Jun 19, 2024Updated 2 years ago
FranxYao / Retrieval-Head-with-Flash-Attention
View on GitHub
Efficient retrieval head analysis with triton flash attention that supports topK probability
☆13Jun 15, 2024Updated 2 years ago
llm-misinformation / llm-misinformation
View on GitHub
The dataset and code for the ICLR 2024 paper "Can LLM-Generated Misinformation Be Detected?"
☆85Jul 11, 2026Updated last week
Ajim63 / Attention_GNN_GFJ
View on GitHub
Python Implementation of the Paper "Attention based dynamic graph neural network for asset pricing" -Published in Global Finance Journal
☆14Oct 11, 2023Updated 2 years ago
LLM-Tuning-Safety / LLMs-Finetuning-Safety
View on GitHub
We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20…
☆358Feb 23, 2024Updated 2 years ago
1andrevich / antifilter-domain
View on GitHub
Generated geosite.dat based on Antifilter Community List
☆29Updated this week
xqlin98 / APOHF
View on GitHub
Prompt Optimization with Human Feedback
☆18Aug 7, 2024Updated last year