sugarandgugu / Simple-Trl-Training

Fine-tune large language models with the DPO algorithm; simple and easy to get started with.
28 · Updated 4 months ago

Related projects

Alternatives and complementary repositories for Simple-Trl-Training