A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
★23,746 · Aug 15, 2024 · Updated last year
Alternatives and similar repositories for minGPT
Users who are interested in minGPT are comparing it to the libraries listed below.
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ★54,071 · Nov 12, 2025 · Updated 3 months ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★157,071 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★41,706 · Updated this week
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ★30,884 · Updated this week
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ★34,987 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★32,170 · Sep 30, 2025 · Updated 5 months ago
- Inference Llama 2 in one file of pure C ★19,213 · Aug 6, 2024 · Updated last year
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ★14,842 · Aug 8, 2024 · Updated last year
- Google Research ★37,367 · Updated this week
- Code for the paper "Language Models are Unsupervised Multitask Learners" ★24,648 · Aug 14, 2024 · Updated last year
- Making large AI models cheaper, faster and more accessible ★41,359 · Feb 23, 2026 · Updated last week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★22,030 · Jan 23, 2026 · Updated last month
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ★31,471 · Updated this week
- Fast and memory-efficient exact attention ★22,460 · Updated this week
- LLM inference in C/C++ ★96,322 · Updated this week
- A library for efficient similarity search and clustering of dense vectors. ★39,255 · Updated this week
- Inference code for Llama models ★59,183 · Jan 26, 2025 · Updated last year
- Code and documentation to train Stanford's Alpaca models, and generate the data. ★30,271 · Jul 17, 2024 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ★71,883 · Updated this week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ★39,426 · Jun 2, 2025 · Updated 9 months ago
- LLM training in simple, raw C/CUDA ★28,993 · Jun 26, 2025 · Updated 8 months ago
- Train transformer language models with reinforcement learning. ★17,460 · Feb 26, 2026 · Updated last week
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. ★32,873 · Feb 26, 2026 · Updated last week
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… ★37,444 · Aug 17, 2024 · Updated last year
- CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image ★32,642 · Feb 18, 2026 · Updated 2 weeks ago
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others) ★9,415 · Feb 20, 2026 · Updated last week
- Development repository for the Triton language and compiler ★18,501 · Updated this week
- The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights --… ★36,420 · Feb 26, 2026 · Updated last week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ★41,516 · Updated this week
- 🦜🔗 The platform for reliable agents. ★127,809 · Updated this week
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ★97,870 · Updated this week
- LlamaIndex is the leading document agent and OCR platform ★47,210 · Feb 26, 2026 · Updated last week
- Ongoing research training transformer models at scale ★15,461 · Updated this week
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ★10,347 · Jul 1, 2024 · Updated last year
- Build and share delightful machine learning apps, all in Python. Star to support our work! ★41,855 · Updated this week
- Neural Networks: Zero to Hero ★20,598 · Aug 18, 2024 · Updated last year
- A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,513 · Updated this week
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ★14,393 · Feb 21, 2026 · Updated last week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★20,717 · Updated this week