natolambert / rlhf-book
Textbook on reinforcement learning from human feedback
☆505 Updated this week
Alternatives and similar repositories for rlhf-book:
Users who are interested in rlhf-book are comparing it to the repositories listed below.
- Understanding R1-Zero-Like Training: A Critical Perspective ☆725 Updated this week
- procedural reasoning datasets ☆541 Updated this week
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… ☆605 Updated last week
- Best practices & guides on how to write distributed pytorch training code ☆383 Updated last month
- System 2 Reasoning Link Collection ☆818 Updated 2 weeks ago
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆782 Updated 3 weeks ago
- Verifiers for LLM Reinforcement Learning ☆727 Updated last week
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆970 Updated 3 weeks ago
- A bibliography and survey of the papers surrounding o1 ☆1,183 Updated 4 months ago
- Recipes to scale inference-time compute of open models ☆1,048 Updated last month
- LLM Analytics ☆648 Updated 5 months ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,015 Updated 2 months ago
- Friends of OLMo and their links. ☆272 Updated 3 months ago
- Testing baseline LLMs performance across various models ☆244 Updated last week
- Automatic evals for LLMs ☆346 Updated this week
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆426 Updated 6 months ago
- PyTorch building blocks for the OLMo ecosystem ☆177 Updated this week
- Build your own visual reasoning model ☆320 Updated last week
- Pretraining code for a large-scale depth-recurrent language model ☆709 Updated 2 weeks ago
- LIMO: Less is More for Reasoning ☆875 Updated last month
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆300 Updated 5 months ago
- ☆1,011 Updated 3 months ago
- An example starter repo for Python projects ☆247 Updated last week
- Synthetic data curation for post-training and structured data extraction ☆1,097 Updated last week
- A reading list on LLM based Synthetic Data Generation 🔥 ☆1,223 Updated last month
- ☆504 Updated 4 months ago
- Open weights language model from Google DeepMind, based on Griffin. ☆629 Updated last month
- An extension of the nanoGPT repository for training small MOE models. ☆109 Updated 3 weeks ago
- Sparsify transformers with SAEs and transcoders ☆499 Updated this week
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆1,358 Updated this week