Paul33333 / SFT-and-DPO

A detailed code demo showing how to run full-parameter Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO).
12 · Updated 2 months ago

Alternatives and similar repositories for SFT-and-DPO:

Users who are interested in SFT-and-DPO are comparing it to the libraries listed below.