eric-mitchell / direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)
2,024 · Updated last month

Related projects: