KellerJordan / modded-nanogpt
NanoGPT (124M) in 3 minutes
☆2,479 Updated 2 weeks ago
Alternatives and similar repositories for modded-nanogpt:
Users who are interested in modded-nanogpt are comparing it to the libraries listed below:
- nanoGPT style version of Llama 3.1 ☆1,351 Updated 8 months ago
- A PyTorch native library for large model training ☆3,587 Updated this week
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆987 Updated last month
- Minimalistic large language model 3D-parallelism training ☆1,786 Updated this week
- Code for BLT research paper ☆1,443 Updated last week
- Puzzles for learning Triton ☆1,566 Updated 4 months ago
- Official repository for our work on micro-budget training of large-scale diffusion models. ☆1,375 Updated 3 months ago
- Efficient Triton Kernels for LLM Training ☆4,836 Updated this week
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,051 Updated 2 months ago
- Tile primitives for speedy kernels ☆2,251 Updated this week
- Everything about the SmolLM2 and SmolVLM family of models ☆2,177 Updated 2 weeks ago
- Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs. ☆4,510 Updated last week
- UNet diffusion model in pure CUDA ☆601 Updated 9 months ago
- verl: Volcano Engine Reinforcement Learning for LLMs ☆6,699 Updated this week
- 🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton ☆2,250 Updated this week
- System 2 Reasoning Link Collection ☆825 Updated last month
- AllenAI's post-training codebase ☆2,898 Updated this week
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆1,414 Updated this week
- Entropy Based Sampling and Parallel CoT Decoding ☆3,349 Updated 5 months ago
- Recipes to scale inference-time compute of open models ☆1,051 Updated last month
- Simple RL training for reasoning ☆3,435 Updated last week
- Thunder gives you PyTorch models superpowers for training and inference. Unlock out-of-the-box optimizations for performance, memory and … ☆1,323 Updated this week
- A library for mechanistic interpretability of GPT-style language models ☆2,061 Updated this week
- PyTorch native quantization and sparsity for training and inference ☆1,954 Updated this week
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,347 Updated last week
- ☆1,014 Updated 4 months ago
- Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜 ☆1,407 Updated 3 weeks ago
- Democratizing Reinforcement Learning for LLMs ☆2,976 Updated this week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,629 Updated this week
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆860 Updated last month