casiatao / LPO

The official PyTorch implementation of “Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization”.
12 · Updated 2 weeks ago

Alternatives and similar repositories for LPO:

Users who are interested in LPO are comparing it to the libraries listed below.