jackfsuia / nanoRLHF

RLHF experiments on a single A100 40G GPU. Supports PPO, GRPO, REINFORCE, RAFT, RLOO, and ReMax, as well as reproducing DeepSeek R1-Zero.
79 · Feb 19, 2025 · Updated 11 months ago

Alternatives and similar repositories for nanoRLHF

Users who are interested in nanoRLHF are comparing it to the libraries listed below.
