TianduoWang / DPO-ST

[ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
☆44 Updated 10 months ago

Alternatives and similar repositories for DPO-ST

Users who are interested in DPO-ST are comparing it to the libraries listed below.
