li-plus / nanoRLHF
Train a tiny LLaMA model from scratch to repeat your words using Reinforcement Learning from Human Feedback (RLHF)
17 · May 23, 2024 · Updated last year

Alternatives and similar repositories for nanoRLHF

Users who are interested in nanoRLHF are comparing it to the libraries listed below.
