jackfsuia / nanoRLHFLinks
RLHF experiments on a single A100 40G GPU. Support PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, DeepSeek R1-Zero reproducing.
☆72Updated 6 months ago
Alternatives and similar repositories for nanoRLHF
Users that are interested in nanoRLHF are comparing it to the libraries listed below
Sorting: