EleutherAI / cookbook

Deep learning for dummies. All the practical details and useful utilities that go into working with real models.

☆808

Alternatives and similar repositories for cookbook

Users who are interested in cookbook are comparing it to the repositories listed below.

srush / LLM-Training-Puzzles
What would you do with 1000 H100s...
☆1,068 Updated last year
LambdaLabsML / distributed-training-guide
Best practices & guides on how to write distributed pytorch training code
☆460 Updated 5 months ago
rwitten / HighPerfLLMs2024
☆516 Updated last year
srush / Transformer-Puzzles
Puzzles for exploring transformers
☆355 Updated 2 years ago
huggingface / picotron
Minimalistic 4D-parallelism distributed training framework for education purpose
☆1,619 Updated 3 weeks ago
open-thought / system-2-research
System 2 Reasoning Link Collection
☆848 Updated 4 months ago
carlini / yet-another-applied-llm-benchmark
A benchmark to evaluate language models on questions I've previously asked them to solve.
☆1,023 Updated 3 months ago
srush / Autodiff-Puzzles
☆443 Updated 9 months ago
gautierdag / bpeasy
Fast bare-bones BPE for modern tokenizer training
☆160 Updated last month
Quentin-Anthony / torch-profiling-tutorial
☆441 Updated 2 weeks ago
HazyResearch / aisys-building-blocks
Building blocks for foundation models.
☆519 Updated last year
callummcdougall / ARENA_3.0
☆634 Updated this week
srush / awesome-o1
A bibliography and survey of the papers surrounding o1
☆1,207 Updated 8 months ago
allenai / fm-cheatsheet
Website for hosting the Open Foundation Models Cheat Sheet.
☆267 Updated 2 months ago
huggingface / nanotron
Minimalistic large language model 3D-parallelism training
☆2,068 Updated 3 weeks ago
mlfoundations / open_lm
A repository for research on medium sized language models.
☆505 Updated last month
open-thought / reasoning-gym
Procedural reasoning datasets
☆998 Updated this week
clu0 / unet.cu
UNet diffusion model in pure CUDA
☆612 Updated last year
callummcdougall / ARENA_2.0
Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL.
☆219 Updated last year
microsoft / Samba
[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
☆899 Updated 3 months ago
MekkCyber / TritonAcademy
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆188 Updated 2 months ago
srush / annotated-mamba
Annotated version of the Mamba paper
☆487 Updated last year
predibase / llm_distillation_playbook
Best practices for distilling large language models.
☆568 Updated last year
srush / Triton-Puzzles
Puzzles for learning Triton
☆1,801 Updated 8 months ago
EleutherAI / sparsify
Sparsify transformers with SAEs and transcoders
☆595 Updated this week
EurekaLabsAI / tensor
The Tensor (or Array)
☆441 Updated 11 months ago
EurekaLabsAI / mlp
The Multilayer Perceptron Language Model
☆556 Updated 11 months ago
Laz4rz / GPT-2
Following master Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish
☆172 Updated last year
NousResearch / atropos
Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse …
☆568 Updated this week
huggingface / search-and-learn
Recipes to scale inference-time compute of open models
☆1,110 Updated 2 months ago