ash80 / RLHF_in_notebooks

RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks
231 · Jun 20, 2025 · Updated 7 months ago

Alternatives and similar repositories for RLHF_in_notebooks

Users who are interested in RLHF_in_notebooks are comparing it to the libraries listed below.
