KellerJordan / modded-nanogpt
NanoGPT (124M) in 3 minutes
☆2,600 · Updated this week
Alternatives and similar repositories for modded-nanogpt
Users interested in modded-nanogpt are comparing it to the repositories listed below.
- Minimalistic 4D-parallelism distributed training framework for educational purposes ☆1,505 · Updated 2 months ago
- A PyTorch native platform for training generative AI models ☆3,838 · Updated this week
- nanoGPT style version of Llama 3.1 ☆1,373 · Updated 9 months ago
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,372 · Updated last month
- Minimalistic large language model 3D-parallelism training ☆1,888 · Updated last week
- Code for BLT research paper ☆1,664 · Updated last week
- PyTorch native post-training library ☆5,217 · Updated this week
- Official repository for our work on micro-budget training of large-scale diffusion models. ☆1,416 · Updated 4 months ago
- The n-gram Language Model ☆1,421 · Updated 9 months ago
- The Autograd Engine ☆607 · Updated 8 months ago
- The Multilayer Perceptron Language Model ☆549 · Updated 9 months ago
- Tile primitives for speedy kernels ☆2,399 · Updated this week
- The simplest, fastest repository for training/finetuning small-sized VLMs. ☆3,003 · Updated this week
- Recipes to scale inference-time compute of open models ☆1,073 · Updated last week
- AllenAI's post-training codebase ☆2,986 · Updated this week
- Thunder gives you PyTorch models superpowers for training and inference. Unlock out-of-the-box optimizations for performance, memory and … ☆1,350 · Updated this week
- Schedule-Free Optimization in PyTorch ☆2,161 · Updated last week
- PyTorch native quantization and sparsity for training and inference ☆2,064 · Updated this week
- System 2 Reasoning Link Collection ☆834 · Updated 2 months ago
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,586 · Updated last week
- Puzzles for learning Triton ☆1,658 · Updated 6 months ago
- Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs. ☆4,555 · Updated last week
- Everything about the SmolLM2 and SmolVLM family of models ☆2,442 · Updated 2 months ago
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024). ☆1,300 · Updated last month
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆874 · Updated last month
- Democratizing Reinforcement Learning for LLMs ☆3,291 · Updated 2 weeks ago
- UNet diffusion model in pure CUDA ☆605 · Updated 11 months ago
- ☆4,083 · Updated 11 months ago
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆793 · Updated last month
- DataComp for Language Models ☆1,300 · Updated 2 months ago