A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
★24,464 · Aug 15, 2024 · Updated last year
Alternatives and similar repositories for minGPT
Users that are interested in minGPT are comparing it to the libraries listed below.
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ★59,144 · Nov 12, 2025 · Updated 6 months ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★161,309 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★42,412 · Updated this week
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ★31,168 · Updated this week
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ★16,121 · Aug 8, 2024 · Updated last year
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ★35,741 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★32,229 · Sep 30, 2025 · Updated 8 months ago
- Inference Llama 2 in one file of pure C ★19,577 · Aug 6, 2024 · Updated last year
- Code for the paper "Language Models are Unsupervised Multitask Learners" ★24,890 · Aug 14, 2024 · Updated last year
- Google Research ★38,023 · May 29, 2026 · Updated last week
- LLM training in simple, raw C/CUDA ★30,087 · Jun 26, 2025 · Updated 11 months ago
- Making large AI models cheaper, faster and more accessible ★41,385 · May 25, 2026 · Updated last week
- Neural Networks: Zero to Hero ★22,911 · Aug 18, 2024 · Updated last year
- Fast and memory-efficient exact attention ★24,037 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★22,139 · Jan 23, 2026 · Updated 4 months ago
- Code and documentation to train Stanford's Alpaca models, and generate the data. ★30,248 · Jul 17, 2024 · Updated last year
- Train transformer language models with reinforcement learning. ★18,547 · Updated this week
- Inference code for Llama models ★59,438 · Jan 26, 2025 · Updated last year
- LLM inference in C/C++ ★114,217 · Updated this week
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ★32,921 · Updated this week
- A library for efficient similarity search and clustering of dense vectors. ★40,202 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ★81,909 · Updated this week
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others) ★9,501 · Updated this week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ★39,479 · May 1, 2026 · Updated last month
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ★10,523 · Jul 1, 2024 · Updated last year
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. ★33,775 · Updated this week
- CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image ★33,689 · Mar 25, 2026 · Updated 2 months ago
- The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights --… ★36,850 · Updated this week
- Ongoing research training transformer models at scale ★16,519 · Updated this week
- Development repository for the Triton language and compiler ★19,313 · Updated this week
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ★100,309 · Updated this week
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… ★37,410 · Aug 17, 2024 · Updated last year
- The agent engineering platform. ★138,156 · Updated this week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ★42,709 · May 30, 2026 · Updated last week
- LlamaIndex is the leading document agent and OCR platform ★49,909 · Updated this week
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ★42,815 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,711 · Updated this week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★21,226 · Updated this week
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ★14,550 · Updated this week