A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
☆23,950 · Aug 15, 2024 · Updated last year
Alternatives and similar repositories for minGPT
Users that are interested in minGPT are comparing it to the libraries listed below.
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆55,432 · Nov 12, 2025 · Updated 4 months ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ☆158,424 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆41,869 · Mar 18, 2026 · Updated last week
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ☆30,952 · Updated this week
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ☆15,147 · Aug 8, 2024 · Updated last year
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ☆35,190 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆32,190 · Sep 30, 2025 · Updated 5 months ago
- Inference Llama 2 in one file of pure C ☆19,302 · Aug 6, 2024 · Updated last year
- Code for the paper "Language Models are Unsupervised Multitask Learners" ☆24,707 · Aug 14, 2024 · Updated last year
- Google Research ☆37,494 · Mar 18, 2026 · Updated last week
- LLM training in simple, raw C/CUDA ☆29,216 · Jun 26, 2025 · Updated 9 months ago
- Neural Networks: Zero to Hero ☆21,025 · Aug 18, 2024 · Updated last year
- Making large AI models cheaper, faster and more accessible ☆41,376 · Mar 16, 2026 · Updated last week
- Fast and memory-efficient exact attention ☆22,938 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆22,059 · Jan 23, 2026 · Updated 2 months ago
- LLM inference in C/C++ ☆98,911 · Updated this week
- Train transformer language models with reinforcement learning. ☆17,781 · Updated this week
- Code and documentation to train Stanford's Alpaca models, and generate the data. ☆30,256 · Jul 17, 2024 · Updated last year
- Inference code for Llama models ☆59,250 · Jan 26, 2025 · Updated last year
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ☆31,715 · Updated this week
- A library for efficient similarity search and clustering of dense vectors. ☆39,484 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆74,135 · Updated this week
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others) ☆9,438 · Feb 20, 2026 · Updated last month
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆39,445 · Jun 2, 2025 · Updated 9 months ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆10,385 · Jul 1, 2024 · Updated last year
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. ☆33,157 · Updated this week
- CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image ☆32,946 · Feb 18, 2026 · Updated last month
- Ongoing research training transformer models at scale ☆15,744 · Mar 20, 2026 · Updated last week
- The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights --… ☆36,538 · Mar 18, 2026 · Updated last week
- Development repository for the Triton language and compiler ☆18,708 · Updated this week
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ☆98,480 · Updated this week
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… ☆37,435 · Aug 17, 2024 · Updated last year
- The agent engineering platform ☆130,454 · Mar 20, 2026 · Updated last week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ☆41,799 · Mar 20, 2026 · Updated last week
- LlamaIndex is the leading document agent and OCR platform ☆47,963 · Updated this week
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ☆42,151 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆9,580 · Updated this week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ☆20,841 · Mar 18, 2026 · Updated last week
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ☆14,431 · Mar 5, 2026 · Updated 3 weeks ago