A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
24,112 stars · Aug 15, 2024 · Updated last year
Alternatives and similar repositories for minGPT
Users who are interested in minGPT are comparing it to the libraries listed below.
- The simplest, fastest repository for training/finetuning medium-sized GPTs. (56,599 stars · Nov 12, 2025 · Updated 5 months ago)
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… (159,455 stars · Updated this week)
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. (42,029 stars · Updated this week)
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. (31,042 stars · Apr 7, 2026 · Updated last week)
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API (15,431 stars · Aug 8, 2024 · Updated last year)
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more (35,370 stars · Updated this week)
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. (32,201 stars · Sep 30, 2025 · Updated 6 months ago)
- Inference Llama 2 in one file of pure C (19,379 stars · Aug 6, 2024 · Updated last year)
- Code for the paper "Language Models are Unsupervised Multitask Learners" (24,753 stars · Aug 14, 2024 · Updated last year)
- Google Research (37,679 stars · Apr 9, 2026 · Updated last week)
- LLM training in simple, raw C/CUDA (29,511 stars · Jun 26, 2025 · Updated 9 months ago)
- Neural Networks: Zero to Hero (21,385 stars · Aug 18, 2024 · Updated last year)
- Making large AI models cheaper, faster and more accessible (41,364 stars · Updated this week)
- Fast and memory-efficient exact attention (23,344 stars · Updated this week)
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities (22,086 stars · Jan 23, 2026 · Updated 2 months ago)
- Code and documentation to train Stanford's Alpaca models, and generate the data. (30,260 stars · Jul 17, 2024 · Updated last year)
- LLM inference in C/C++ (103,237 stars · Updated this week)
- Train transformer language models with reinforcement learning. (18,054 stars · Updated this week)
- Inference code for Llama models (59,324 stars · Jan 26, 2025 · Updated last year)
- You like pytorch? You like micrograd? You love tinygrad! ❤️ (32,330 stars · Updated this week)
- A library for efficient similarity search and clustering of dense vectors. (39,720 stars · Updated this week)
- A high-throughput and memory-efficient inference and serving engine for LLMs (76,536 stars · Updated this week)
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others) (9,456 stars · Apr 9, 2026 · Updated last week)
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. (39,448 stars · Jun 2, 2025 · Updated 10 months ago)
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. (10,417 stars · Jul 1, 2024 · Updated last year)
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. (33,336 stars · Updated this week)
- CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image (33,184 stars · Mar 25, 2026 · Updated 3 weeks ago)
- Ongoing research training transformer models at scale (15,985 stars · Updated this week)
- The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights --…