natolambert / rlhf-book
Textbook on reinforcement learning from human feedback
☆795 Updated this week
Alternatives and similar repositories for rlhf-book:
Users who are interested in rlhf-book are comparing it to the libraries listed below:
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆419 Updated 2 weeks ago
- Verifiers for LLM Reinforcement Learning ☆827 Updated 3 weeks ago
- Recipes to scale inference-time compute of open models ☆1,058 Updated 2 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective ☆882 Updated last week
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,184 Updated last week
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,076 Updated 3 months ago
- Minimalistic 4D-parallelism distributed training framework for educational purposes ☆991 Updated last month
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆1,143 Updated 3 months ago
- System 2 Reasoning Link Collection ☆826 Updated last month
- LIMO: Less is More for Reasoning ☆920 Updated 2 weeks ago
- Synthetic data curation for post-training and structured data extraction ☆1,257 Updated this week
- A bibliography and survey of the papers surrounding o1 ☆1,187 Updated 5 months ago
- A reading list on LLM based Synthetic Data Generation 🔥 ☆1,246 Updated 2 months ago
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents ☆484 Updated 2 weeks ago
- procedural reasoning datasets ☆571 Updated this week
- Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution" ☆509 Updated last month
- ☆1,017 Updated 4 months ago
- Code for BLT research paper ☆1,513 Updated last week
- Build your own visual reasoning model ☆341 Updated this week
- This repository includes the official implementation of OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs. ☆675 Updated last week
- OLMoE: Open Mixture-of-Experts Language Models ☆716 Updated last month
- Pretraining code for a large-scale depth-recurrent language model ☆745 Updated last week
- Minimalistic large language model 3D-parallelism training ☆1,808 Updated this week
- Automatic evals for LLMs ☆373 Updated this week
- Minimal and annotated implementations of key ideas from modern deep learning research. ☆270 Updated this week
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL ☆2,048 Updated 2 weeks ago
- ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning ☆714 Updated this week
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆1,438 Updated last week
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments. ☆1,462 Updated this week
- Awesome Reasoning LLM Tutorial/Survey/Guide ☆1,436 Updated 2 weeks ago