princeton-nlp / unintentional-unalignment

[ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
32 · Jan 7, 2026 · Updated last month

Alternatives and similar repositories for unintentional-unalignment

Users who are interested in unintentional-unalignment are comparing it to the libraries listed below.
