snu-mllab / DPPO

Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023)
41 · Updated 5 months ago

Alternatives and similar repositories for DPPO:

Users who are interested in DPPO are comparing it to the libraries listed below.