junkangwu / beta-DPO
[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
☆50 · Updated last year
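For context, the repository builds on the standard DPO objective, shown below as a minimal sketch; this formula is the vanilla DPO loss, not the repository's exact implementation. β-DPO's contribution, as the title indicates, is treating $\beta$ as a dynamically calibrated quantity rather than a fixed hyperparameter; the precise calibration rule (e.g. batch-level adjustment) is described in the paper, not in this listing.

$$\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log\sigma\!\left(\beta\log\frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)} - \beta\log\frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}\right)\right]$$

Here $\pi_\theta$ is the policy being trained, $\pi_{\mathrm{ref}}$ the frozen reference model, and $(x, y_w, y_l)$ a prompt with preferred and dispreferred responses.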
Alternatives and similar repositories for beta-DPO
Users interested in beta-DPO are comparing it to the repositories listed below
- A Sober Look at Language Model Reasoning ☆92 · Updated last month
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆64 · Updated last year
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization