vicgalle / refined-dpo

Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs
11 · Updated 7 months ago

Related projects: