karpathy / build-nanogpt
Video+code lecture on building nanoGPT from scratch
☆3,806 · Updated 5 months ago
Alternatives and similar repositories for build-nanogpt
Users who are interested in build-nanogpt are comparing it to the repositories listed below:
- nanoGPT style version of Llama 3.1 ☆1,300 · Updated 5 months ago
- NanoGPT (124M) in 3 minutes ☆2,152 · Updated this week
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆9,346 · Updated 6 months ago
- The n-gram Language Model ☆1,370 · Updated 5 months ago
- llama3 implementation one matrix multiplication at a time ☆14,074 · Updated 8 months ago
- ☆4,054 · Updated 7 months ago
- The Multilayer Perceptron Language Model ☆533 · Updated 5 months ago
- PyTorch native post-training library ☆4,765 · Updated this week
- A PyTorch native library for large model training ☆3,200 · Updated this week
- The official PyTorch implementation of Google's Gemma models ☆5,341 · Updated 3 weeks ago
- ☆2,811 · Updated 4 months ago
- ☆3,716 · Updated 11 months ago
- The Autograd Engine ☆555 · Updated 4 months ago
- LLM training in simple, raw C/CUDA ☆25,158 · Updated 3 months ago
- DataComp for Language Models ☆1,209 · Updated last month
- Examples in the MLX framework ☆6,691 · Updated this week
- LLM101n: Let's build a Storyteller ☆31,136 · Updated 5 months ago
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆5,767 · Updated last month
- Material for gpu-mode lectures ☆3,567 · Updated 3 weeks ago
- An autoregressive character-level language model for making more things ☆2,723 · Updated 7 months ago
- Tools for merging pretrained large language models. ☆5,157 · Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆11,290 · Updated this week
- Implementation for MatMul-free LM. ☆2,948 · Updated 2 months ago
- Official repository for our work on micro-budget training of large-scale diffusion models. ☆1,171 · Updated 2 weeks ago
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,479 · Updated this week
- Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs. ☆4,409 · Updated this week
- Curated list of datasets and tools for post-training. ☆2,560 · Updated 2 weeks ago
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆6,091 · Updated this week
- Go ahead and axolotl questions ☆8,395 · Updated this week
- Modeling, training, eval, and inference code for OLMo ☆5,059 · Updated this week