thomfoster / minRLHF

A (somewhat) minimal library for finetuning language models with PPO on human feedback.
86 · Updated last year

Related projects

Alternatives and complementary repositories for minRLHF