A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
★ 24,533 · Aug 15, 2024 · Updated last year
Alternatives and similar repositories for minGPT
Users that are interested in minGPT are comparing it to the libraries listed below.
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ★ 59,420 · Nov 12, 2025 · Updated 7 months ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★ 161,518 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★ 42,508 · Updated this week
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ★ 31,184 · Updated this week
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ★ 16,298 · Aug 8, 2024 · Updated last year
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ★ 35,786 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★ 32,233 · Sep 30, 2025 · Updated 8 months ago
- Inference Llama 2 in one file of pure C ★ 19,631 · Aug 6, 2024 · Updated last year
- Code for the paper "Language Models are Unsupervised Multitask Learners" ★ 24,930 · Aug 14, 2024 · Updated last year
- Google Research ★ 38,126 · Updated this week
- LLM training in simple, raw C/CUDA ★ 30,209 · Jun 26, 2025 · Updated 11 months ago
- Making large AI models cheaper, faster and more accessible ★ 41,395 · May 25, 2026 · Updated 3 weeks ago
- Neural Networks: Zero to Hero ★ 23,045 · Aug 18, 2024 · Updated last year
- Fast and memory-efficient exact attention ★ 24,111 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★ 22,147 · Jan 23, 2026 · Updated 4 months ago
- Code and documentation to train Stanford's Alpaca models, and generate the data. ★ 30,248 · Jul 17, 2024 · Updated last year
- Train transformer language models with reinforcement learning. ★ 18,613 · Updated this week
- Inference code for Llama models ★ 59,452 · Jan 26, 2025 · Updated last year
- LLM inference in C/C++ ★ 116,603 · Updated this week
- You like pytorch? You like micrograd? You love tinygrad! ❤️