natolambert / rlhf-book
Textbook on reinforcement learning from human feedback
☆1,344 Updated last week
Alternatives and similar repositories for rlhf-book
Users who are interested in rlhf-book are comparing it to the repositories listed below
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆1,984 Updated last week
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆565 Updated 2 months ago
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,262 Updated 3 weeks ago
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,687 Updated 7 months ago
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆1,911 Updated 3 months ago
- Recipes to scale inference-time compute of open models ☆1,119 Updated 6 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆2,185 Updated this week
- Minimalistic large language model 3D-parallelism training ☆2,362 Updated 3 weeks ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,382 Updated 3 months ago
- A reading list on LLM based Synthetic Data Generation 🔥 ☆1,484 Updated 6 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective ☆1,168 Updated 3 months ago
- Synthetic data curation for post-training and structured data extraction ☆1,572 Updated 4 months ago
- Environments for LLM Reinforcement Learning ☆3,603 Updated this week
- Awesome Reasoning LLM Tutorial/Survey/Guide ☆2,207 Updated last month
- A bibliography and survey of the papers surrounding o1 ☆1,214 Updated last year
- System 2 Reasoning Link Collection ☆861 Updated 8 months ago
- Best practices & guides on how to write distributed pytorch training code ☆546 Updated last month
- Post-training with Tinker ☆2,313 Updated last week
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆829 Updated 4 months ago
- ☆402 Updated 11 months ago
- Code for BLT research paper ☆2,013 Updated last month
- DataComp for Language Models ☆1,398 Updated 3 months ago
- MLGym A New Framework and Benchmark for Advancing AI Research Agents ☆576 Updated 4 months ago
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆1,209 Updated 2 weeks ago
- Minimal and annotated implementations of key ideas from modern deep learning research. ☆1,207 Updated 2 months ago
- [COLM 2025] LIMO: Less is More for Reasoning ☆1,054 Updated 4 months ago
- ☆2,477 Updated last month
- PyTorch building blocks for the OLMo ecosystem ☆519 Updated this week
- An interface library for RL post training with environments. ☆829 Updated this week
- Async RL Training at Scale ☆909 Updated this week