KellerJordan / modded-nanogpt
NanoGPT (124M) in 3 minutes
☆2,403 · Updated this week
Alternatives and similar repositories for modded-nanogpt:
Users interested in modded-nanogpt are comparing it to the libraries listed below.
- A PyTorch native library for large model training — ☆3,470 · Updated this week
- nanoGPT style version of Llama 3.1 — ☆1,341 · Updated 7 months ago
- Minimalistic large language model 3D-parallelism training — ☆1,701 · Updated this week
- Code for the BLT research paper — ☆1,436 · Updated last week
- Minimalistic 4D-parallelism distributed training framework for education purposes — ☆935 · Updated 2 weeks ago
- Official repository for our work on micro-budget training of large-scale diffusion models — ☆1,356 · Updated 2 months ago
- Tile primitives for speedy kernels — ☆2,153 · Updated this week
- The n-gram Language Model — ☆1,402 · Updated 7 months ago
- Puzzles for learning Triton — ☆1,508 · Updated 4 months ago
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" — ☆855 · Updated last month
- Schedule-Free Optimization in PyTorch — ☆2,116 · Updated 3 weeks ago
- The Multilayer Perceptron Language Model — ☆543 · Updated 7 months ago
- UNet diffusion model in pure CUDA — ☆600 · Updated 8 months ago
- The Autograd Engine — ☆581 · Updated 6 months ago
- Deep learning for dummies: all the practical details and useful utilities that go into working with real models — ☆783 · Updated 2 weeks ago
- PyTorch native post-training library — ☆5,014 · Updated this week
- System 2 Reasoning Link Collection — ☆811 · Updated this week
- PyTorch native quantization and sparsity for training and inference — ☆1,913 · Updated this week
- Efficient Triton Kernels for LLM Training — ☆4,683 · Updated this week
- Video+code lecture on building nanoGPT from scratch — ☆3,989 · Updated 7 months ago
- Large Concept Models: language modeling in a sentence representation space — ☆2,030 · Updated last month
- Freeing data processing from scripting madness by providing a set of platform-agnostic, customizable pipeline processing blocks — ☆2,299 · Updated 2 weeks ago
- Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors a… — ☆1,310 · Updated this week
- Everything about the SmolLM2 and SmolVLM family of models — ☆2,035 · Updated last week
- What would you do with 1000 H100s... — ☆1,016 · Updated last year
- The Tensor (or Array) — ☆427 · Updated 7 months ago
- Meta Lingua: a lean, efficient, and easy-to-hack codebase for LLM research — ☆4,479 · Updated last month
- 4M: Massively Multimodal Masked Modeling — ☆1,696 · Updated 2 weeks ago
- AllenAI's post-training codebase — ☆2,804 · Updated this week
- A JAX research toolkit for building, editing, and visualizing neural networks — ☆1,746 · Updated 3 months ago