zwhong714 / PSFT

PSFT is a trust-region–inspired fine-tuning objective that views SFT as a policy gradient method with constant advantages, constraining policy drift to stabilize training and improve generalization.
30 · Updated 2 months ago
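The description above can be made concrete with a small sketch. It assumes that "constraining policy drift" means a PPO-style clipped probability ratio between the current policy and a frozen reference (old) policy, with the advantage fixed to 1 for every target token. The function name `clipped_sft_loss`, its arguments, and the default `eps` are hypothetical and are not taken from the PSFT repository.

```python
import math


def clipped_sft_loss(logp_new, logp_old, eps=0.2):
    """Illustrative clipped surrogate loss with a constant advantage of 1.

    logp_new: per-token log-probabilities of the target tokens under the
              current policy (hypothetical input).
    logp_old: per-token log-probabilities of the same tokens under the
              frozen old policy (hypothetical input).
    eps:      clipping range for the probability ratio (assumed value).
    """
    total = 0.0
    for lp_new, lp_old in zip(logp_new, logp_old):
        # Probability ratio pi_new(token) / pi_old(token).
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        # PPO-style pessimistic surrogate; the advantage is the constant 1.
        total += min(ratio, clipped)
    # Negate so that minimizing the loss maximizes the surrogate.
    return -total / len(logp_new)


if __name__ == "__main__":
    # Ratio far above 1 + eps: the surrogate is capped, so the token
    # stops contributing further incentive to raise its probability.
    print(clipped_sft_loss([1.0], [0.0]))
    # Ratio equal to 1: no drift from the old policy.
    print(clipped_sft_loss([0.0], [0.0]))
```

With a constant positive advantage, the surrogate reduces to capping the ratio at `1 + eps`, so a token whose probability has already risen well past the old policy's no longer pushes the update further. That is one way to read the "trust-region-inspired" framing, under the assumptions stated above.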

Alternatives and similar repositories for PSFT

Users who are interested in PSFT are comparing it to the libraries listed below.
