A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
★23,950 · Aug 15, 2024 · Updated last year
Alternatives and similar repositories for minGPT
Users who are interested in minGPT are comparing it to the libraries listed below.
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ★55,432 · Nov 12, 2025 · Updated 4 months ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★158,060 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★41,869 · Updated this week
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ★30,952 · Updated this week
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ★15,147 · Aug 8, 2024 · Updated last year
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ★35,190 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★32,190 · Sep 30, 2025 · Updated 5 months ago
- Inference Llama 2 in one file of pure C ★19,302 · Aug 6, 2024 · Updated last year
- Code for the paper "Language Models are Unsupervised Multitask Learners" ★24,707 · Aug 14, 2024 · Updated last year
- Google Research ★37,494 · Updated this week
- LLM training in simple, raw C/CUDA ★29,216 · Jun 26, 2025 · Updated 8 months ago
- Neural Networks: Zero to Hero ★21,025 · Aug 18, 2024 · Updated last year
- Making large AI models cheaper, faster and more accessible ★41,362 · Mar 16, 2026 · Updated last week
- Fast and memory-efficient exact attention ★22,938 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★22,059 · Jan 23, 2026 · Updated 2 months ago
- LLM inference in C/C++ ★98,911 · Updated this week
- Train transformer language models with reinforcement learning. ★17,697 · Mar 18, 2026 · Updated last week
- Code and documentation to train Stanford's Alpaca models, and generate the data. ★30,258 · Jul 17, 2024 · Updated last year
- Inference code for Llama models ★59,250 · Jan 26, 2025 · Updated last year
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ★31,715 · Updated this week
- A library for efficient similarity search and clustering of dense vectors. ★39,484 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ★74,135 · Updated this week
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others) ★9,438 · Feb 20, 2026 · Updated last month
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ★39,445 · Jun 2, 2025 · Updated 9 months ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ★10,385 · Jul 1, 2024 · Updated last year
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. ★33,085 · Mar 18, 2026 · Updated last week
- CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image ★32,861 · Feb 18, 2026 · Updated last month
- Ongoing research training transformer models at scale ★15,744 · Updated this week
- The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights --… ★36,538 · Updated this week
- Development repository for the Triton language and compiler ★18,708 · Updated this week
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ★98,480 · Updated this week
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… ★37,433 · Aug 17, 2024 · Updated last year
- The agent engineering platform ★130,454 · Updated this week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ★41,799 · Updated this week
- LlamaIndex is the leading document agent and OCR platform ★47,753 · Mar 17, 2026 · Updated last week
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ★42,053 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,563 · Mar 17, 2026 · Updated last week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★20,841 · Mar 18, 2026 · Updated last week
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ★14,431 · Mar 5, 2026 · Updated 2 weeks ago