natolambert / rlhf-book
Textbook on reinforcement learning from human feedback
☆1,313 · Updated this week
Alternatives and similar repositories for rlhf-book
Users who are interested in rlhf-book are comparing it to the repositories listed below
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,670 · Updated 7 months ago
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆1,892 · Updated 2 months ago
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆559 · Updated last month
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆1,931 · Updated last month
- Recipes to scale inference-time compute of open models ☆1,118 · Updated 5 months ago
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,222 · Updated last week
- Minimalistic large language model 3D-parallelism training ☆2,323 · Updated 2 months ago
- Environments for LLM Reinforcement Learning ☆3,495 · Updated this week
- Synthetic data curation for post-training and structured data extraction ☆1,553 · Updated 3 months ago
- Awesome Reasoning LLM Tutorial/Survey/Guide ☆2,151 · Updated last month
- ☆2,432 · Updated 2 weeks ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,339 · Updated 3 months ago
- Minimal and annotated implementations of key ideas from modern deep learning research. ☆1,204 · Updated last month
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆2,126 · Updated this week
- Post-training with Tinker ☆1,932 · Updated last week
- System 2 Reasoning Link Collection ☆855 · Updated 8 months ago
- Large Concept Models: Language modeling in a sentence representation space ☆2,304 · Updated 9 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective ☆1,155 · Updated 2 months ago
- MLGym A New Framework and Benchmark for Advancing AI Research Agents ☆572 · Updated 3 months ago
- Best practices & guides on how to write distributed pytorch training code ☆536 · Updated 3 weeks ago
- A reading list on LLM based Synthetic Data Generation 🔥 ☆1,464 · Updated 5 months ago
- ☆907 · Updated 2 weeks ago
- Code for BLT research paper ☆2,008 · Updated 2 weeks ago
- This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov ☆1,982 · Updated 5 months ago
- [COLM 2025] LIMO: Less is More for Reasoning ☆1,046 · Updated 3 months ago
- A bibliography and survey of the papers surrounding o1 ☆1,209 · Updated last year
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆1,166 · Updated 9 months ago
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆824 · Updated 3 months ago
- DataComp for Language Models ☆1,386 · Updated 2 months ago
- ☆396 · Updated 10 months ago