OpenMOSE / RWKV-LM-RLHF

Reinforcement Learning Toolkit for RWKV. Distillation, SFT, RLHF (DPO, ORPO), infinite context training, and aligning. Let's boost the model's intelligence! Currently under construction :)
18 · Updated this week

Related projects

Alternatives and complementary repositories for RWKV-LM-RLHF