Refinath / DPO_LLM_finetuning
Implementation of Direct Preference Optimization (DPO) for fine-tuning large language models, including training pipelines, dataset handling, and experiment configurations for preference-based model alignment.
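For context, the core objective such a repository implements is the DPO loss, which scores a preferred ("chosen") response against a dispreferred ("rejected") one using log-probability ratios between the trained policy and a frozen reference model. The sketch below illustrates that objective only; the function and variable names are assumptions for illustration and are not taken from this repository's code.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1):
    """Illustrative DPO loss (not this repository's API).

    Each tensor holds per-sequence log-probabilities (summed over tokens)
    of the chosen / rejected responses under the policy being trained
    and under the frozen reference model.
    """
    # Log-ratios of policy to reference for each response
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps

    # DPO objective: maximize the margin between chosen and rejected log-ratios
    logits = beta * (chosen_logratio - rejected_logratio)
    loss = -F.logsigmoid(logits).mean()

    # Implicit reward margin, often tracked for logging
    reward_margin = logits.detach()
    return loss, reward_margin
```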
30 · Jul 2, 2024 · Updated last year

Alternatives and similar repositories for DPO_LLM_finetuning

Users interested in DPO_LLM_finetuning are comparing it to the libraries listed below.
