A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
★ 24,428 · Aug 15, 2024 · Updated last year
Alternatives and similar repositories for minGPT
Users that are interested in minGPT are comparing it to the libraries listed below.
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ★ 58,753 · Nov 12, 2025 · Updated 6 months ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★ 160,794 · May 20, 2026 · Updated last week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★ 42,386 · Updated this week
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ★ 31,152 · Updated this week
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ★ 15,980 · Aug 8, 2024 · Updated last year
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ★ 35,691 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★ 32,219 · Sep 30, 2025 · Updated 7 months ago
- Inference Llama 2 in one file of pure C ★ 19,548 · Aug 6, 2024 · Updated last year
- Code for the paper "Language Models are Unsupervised Multitask Learners" ★ 24,868 · Aug 14, 2024 · Updated last year
- Google Research ★ 37,963 · Updated this week
- LLM training in simple, raw C/CUDA ★ 29,997 · Jun 26, 2025 · Updated 11 months ago
- Making large AI models cheaper, faster and more accessible ★ 41,386 · May 18, 2026 · Updated last week
- Neural Networks: Zero to Hero ★ 22,683 · Aug 18, 2024 · Updated last year
- Fast and memory-efficient exact attention ★ 23,917 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★ 22,133 · Jan 23, 2026 · Updated 4 months ago
- Code and documentation to train Stanford's Alpaca models, and generate the data. ★ 30,248 · Jul 17, 2024 · Updated last year
- Train transformer language models with reinforcement learning. ★ 18,411 · May 19, 2026 · Updated last week
- Inference code for Llama models ★ 59,436 · Jan 26, 2025 · Updated last year
- LLM inference in C/C++ ★ 112,590 · Updated this week
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ★ 32,753 · Updated this week
- A library for efficient similarity search and clustering of dense vectors. ★ 40,132 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ★ 80,418 · May 19, 2026 · Updated last week
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others) ★ 9,494 · Updated this week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ★ 39,484 · May 1, 2026 · Updated 3 weeks ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ★ 10,496 · Jul 1, 2024 · Updated last year
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. ★ 33,668 · May 20, 2026 · Updated last week
- CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image ★ 33,514 · Mar 25, 2026 · Updated 2 months ago
- The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights --… ★ 36,825 · Updated this week
- Ongoing research training transformer models at scale ★ 16,427 · Updated this week
- Development repository for the Triton language and compiler ★ 19,246 · Updated this week
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ★ 100,144 · Updated this week
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… ★ 37,409 · Aug 17, 2024 · Updated last year
- The agent engineering platform. ★ 137,448 · Updated this week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ★ 42,616 · Updated this week
- LlamaIndex is the leading document agent and OCR platform ★ 49,501 · May 15, 2026 · Updated last week
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ★ 42,634 · May 20, 2026 · Updated last week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★ 9,691 · May 19, 2026 · Updated last week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★ 21,187 · Updated this week
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ★ 14,536 · Updated this week