natolambert / rlhf-book
Textbook on reinforcement learning from human feedback
☆76 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for rlhf-book
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆58 Updated 3 months ago
- ☆101 Updated 3 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆93 Updated 3 months ago
- A puzzle to learn about prompting ☆121 Updated last year
- code for training & evaluating Contextual Document Embedding models ☆117 Updated last week
- Minimal but scalable implementation of large language models in JAX ☆26 Updated 2 weeks ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆80 Updated 11 months ago
- Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL ☆43 Updated last month
- Experiments for efforts to train a new and improved t5 ☆76 Updated 7 months ago
- Cost aware hyperparameter tuning algorithm ☆124 Updated 4 months ago
- Can Language Models Solve Olympiad Programming? ☆101 Updated 3 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆84 Updated this week
- ☆112 Updated last month
- Functional Benchmarks and the Reasoning Gap ☆78 Updated last month
- ☆68 Updated 3 months ago
- Language models scale reliably with over-training and on downstream tasks ☆94 Updated 7 months ago
- ☆90 Updated 4 months ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆113 Updated 7 months ago
- A toolkit for scaling law research ⚖ ☆43 Updated 8 months ago
- Supercharge huggingface transformers with model parallelism. ☆75 Updated last month
- Automatic Evals for Instruction-Tuned Models ☆45 Updated this week
- LLM training in simple, raw C/CUDA ☆12 Updated last month
- A set of Python scripts that makes your experience on TPU better ☆40 Updated 4 months ago
- ☆40 Updated 6 months ago
- An introduction to LLM Sampling ☆64 Updated 2 weeks ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆128 Updated last month
- ☆87 Updated 9 months ago
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… ☆87 Updated 3 months ago
- ☆55 Updated last month
- ☆45 Updated 2 months ago