uclaml / SPPO

The official implementation of Self-Play Preference Optimization (SPPO)
471 · Updated last week

Alternatives and similar repositories for SPPO:

Users who are interested in SPPO are comparing it to the libraries listed below.