RLHFlow / RLHF-Reward-Modeling

Recipes for training reward models for RLHF.

Related projects: