princeton-nlp / unintentional-unalignment

Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
12 · Updated 3 weeks ago
