JIA-Lab-research / Step-DPO
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
391 · Updated Jan 19, 2025

Alternatives and similar repositories for Step-DPO

Users who are interested in Step-DPO are comparing it to the libraries listed below.
