natolambert / rlhf-book
Textbook on reinforcement learning from human feedback
☆1,416 Updated this week
Alternatives and similar repositories for rlhf-book
Users who are interested in rlhf-book are comparing it to the libraries listed below.
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆581 Updated 3 months ago
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,747 Updated 9 months ago
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆2,033 Updated last month
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,307 Updated this week
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆1,991 Updated 4 months ago
- Recipes to scale inference-time compute of open models ☆1,125 Updated 8 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆2,274 Updated last week
- Awesome Reasoning LLM Tutorial/Survey/Guide ☆2,250 Updated 3 months ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,463 Updated 5 months ago
- Our library for RL environments + evals ☆3,748 Updated this week
- A bibliography and survey of the papers surrounding o1 ☆1,214 Updated last year
- A reading list on LLM based Synthetic Data Generation 🔥 ☆1,510 Updated 7 months ago
- System 2 Reasoning Link Collection ☆867 Updated 10 months ago
- ☆409 Updated last year
- Minimal and annotated implementations of key ideas from modern deep learning research. ☆1,219 Updated 3 months ago
- Minimalistic large language model 3D-parallelism training ☆2,422 Updated last month
- Understanding R1-Zero-Like Training: A Critical Perspective ☆1,193 Updated 4 months ago
- Synthetic data curation for post-training and structured data extraction ☆1,602 Updated 2 weeks ago
- ☆2,546 Updated last week
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆829 Updated 5 months ago
- Best practices & guides on how to write distributed pytorch training code ☆569 Updated 2 months ago
- [COLM 2025] LIMO: Less is More for Reasoning ☆1,062 Updated 5 months ago
- Post-training with Tinker ☆2,756 Updated this week
- ☆1,032 Updated last year
- MLGym A New Framework and Benchmark for Advancing AI Research Agents ☆584 Updated 5 months ago
- Async RL Training at Scale ☆1,005 Updated this week
- PyTorch building blocks for the OLMo ecosystem ☆706 Updated this week
- A project to improve skills of large language models ☆767 Updated this week
- Bringing BERT into modernity via both architecture changes and scaling ☆1,614 Updated 6 months ago
- SkyRL: A Modular Full-stack RL Library for LLMs ☆1,456 Updated this week