A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
★24,275 · Aug 15, 2024 · Updated last year
Alternatives and similar repositories for minGPT
Users who are interested in minGPT are comparing it to the libraries listed below.
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ★57,469 · Nov 12, 2025 · Updated 5 months ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★160,288 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★42,231 · Updated this week
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ★31,104 · Updated this week
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ★15,710 · Aug 8, 2024 · Updated last year
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ★35,536 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★32,212 · Sep 30, 2025 · Updated 7 months ago
- Inference Llama 2 in one file of pure C ★19,460 · Aug 6, 2024 · Updated last year
- Code for the paper "Language Models are Unsupervised Multitask Learners" ★24,815 · Aug 14, 2024 · Updated last year
- Google Research ★37,825 · Apr 30, 2026 · Updated last week
- LLM training in simple, raw C/CUDA ★29,780 · Jun 26, 2025 · Updated 10 months ago
- Neural Networks: Zero to Hero ★21,696 · Aug 18, 2024 · Updated last year
- Making large AI models cheaper, faster and more accessible ★41,379 · Apr 27, 2026 · Updated last week
- Fast and memory-efficient exact attention ★23,628 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★22,114 · Jan 23, 2026 · Updated 3 months ago
- Code and documentation to train Stanford's Alpaca models, and generate the data. ★30,258 · Jul 17, 2024 · Updated last year
- Train transformer language models with reinforcement learning. ★18,282 · Updated this week
- LLM inference in C/C++ ★107,892 · Updated this week
- Inference code for Llama models ★59,382 · Jan 26, 2025 · Updated last year
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ★32,603 · Updated this week
- A library for efficient similarity search and clustering of dense vectors. ★39,918 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ★78,979 · Updated this week
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others) ★9,476 · Apr 19, 2026 · Updated 2 weeks ago
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ★39,463 · Updated this week
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ★10,462 · Jul 1, 2024 · Updated last year
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. ★33,556 · Updated this week
- CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image ★33,398 · Mar 25, 2026 · Updated last month
- The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights --… ★36,739 · Apr 29, 2026 · Updated last week
- Ongoing research training transformer models at scale ★16,203 · Updated this week
- Development repository for the Triton language and compiler ★19,087 · Updated this week
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ★99,586 · Updated this week
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… ★37,409 · Aug 17, 2024 · Updated last year
- The agent engineering platform ★135,612 · Updated this week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ★42,373 · Apr 30, 2026 · Updated last week
- LlamaIndex is the leading document agent and OCR platform ★49,127 · Updated this week
- Build and share delightful machine learning apps, all in Python. Star to support our work! ★42,505 · Updated this week
- A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,658 · Apr 29, 2026 · Updated last week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★21,052 · Updated this week
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ★14,503 · Apr 28, 2026 · Updated last week