Triang-jyed-driung / RWKV-LM-RLHF-DPO

Direct Preference Optimization for RWKV, targeting RWKV-5 and RWKV-6.
11 · Updated 6 months ago

Related projects: