karpathy / minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
☆20,199 · Updated 3 months ago
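For orientation before the related-project list, here is a rough sketch of what training a model with minGPT looks like, following the usage pattern described in the repository's README. The `RandomTokens` toy dataset is invented here purely to keep the snippet self-contained, and the config field names assume the current `CfgNode`-based API on master, so they may differ in older revisions.

```python
# Hedged sketch of a minGPT training run; RandomTokens is a made-up toy dataset,
# not part of minGPT, used only so the example can run end to end.
import torch
from torch.utils.data import Dataset
from mingpt.model import GPT
from mingpt.trainer import Trainer

class RandomTokens(Dataset):
    """Toy dataset of random token sequences yielding (input, next-token target) pairs."""
    def __init__(self, n=256, block_size=64, vocab_size=128):
        self.block_size, self.vocab_size = block_size, vocab_size
        self.data = torch.randint(vocab_size, (n, block_size + 1))
    def __len__(self):
        return len(self.data)
    def __getitem__(self, i):
        chunk = self.data[i]
        return chunk[:-1], chunk[1:]   # inputs and targets shifted by one position

train_dataset = RandomTokens()

# Model config: pick a preset size, then describe the token and context space.
model_config = GPT.get_default_config()
model_config.model_type = 'gpt-nano'
model_config.vocab_size = train_dataset.vocab_size
model_config.block_size = train_dataset.block_size
model = GPT(model_config)

# Trainer config: a short run on whatever device is available.
train_config = Trainer.get_default_config()
train_config.learning_rate = 5e-4
train_config.max_iters = 100
train_config.batch_size = 32
trainer = Trainer(train_config, model, train_dataset)
trainer.run()
```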
Related projects
Alternatives and complementary repositories for minGPT
- karpathy/nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆37,411 · Updated 3 months ago
- microsoft/unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆20,194 · Updated last week
- huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. ☆135,166 · Updated this week
- microsoft/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆35,508 · Updated this week
- Dao-AILab/flash-attention: Fast and memory-efficient exact attention ☆14,279 · Updated this week
- huggingface/trl: Train transformer language models with reinforcement learning. ☆10,086 · Updated this week
- UKPLab/sentence-transformers: State-of-the-Art Text Embeddings ☆15,368 · Updated this week
- EleutherAI/gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries ☆6,947 · Updated this week
- huggingface/peft: 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ☆16,471 · Updated this week
- gradio-app/gradio: Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ☆34,030 · Updated this week
- facebookresearch/faiss: A library for efficient similarity search and clustering of dense vectors. ☆31,488 · Updated this week
- huggingface/accelerate: 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆7,958 · Updated this week
- NVIDIA/Megatron-LM: Ongoing research training transformer models at scale ☆10,595 · Updated this week
- openai/tiktoken: tiktoken is a fast BPE tokeniser for use with OpenAI's models. ☆12,427 · Updated last month
- jessevig/bertviz: BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.) ☆6,952 · Updated last year
- tatsu-lab/stanford_alpaca: Code and documentation to train Stanford's Alpaca models, and generate the data. ☆29,561 · Updated 4 months ago
- BlinkDL/RWKV-LM: RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆12,672 · Updated this week
- lucidrains/PaLM-rlhf-pytorch: Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM ☆7,705 · Updated 10 months ago
- google/jax: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ☆30,532 · Updated this week
- microsoft/LoRA: Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" ☆10,776 · Updated 3 months ago
- triton-lang/triton: Development repository for the Triton language and compiler ☆13,443 · Updated this week
- facebookresearch/metaseq: Repo for external large-scale work ☆6,516 · Updated 6 months ago
- google/sentencepiece: Unsupervised text tokenizer for Neural Network-based text generation. ☆10,295 · Updated 2 weeks ago
- vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs ☆30,423 · Updated this week
- Lightning-AI/lit-llama: Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad… ☆5,994 · Updated 2 months ago
- lm-sys/FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆36,993 · Updated this week
- tloen/alpaca-lora: Instruct-tune LLaMA on consumer hardware ☆18,653 · Updated 3 months ago
- run-llama/llama_index: LlamaIndex is a data framework for your LLM applications ☆36,820 · Updated this week
- openai/gpt-2: Code for the paper "Language Models are Unsupervised Multitask Learners" ☆22,546 · Updated 3 months ago
- artidoro/qlora: QLoRA: Efficient Finetuning of Quantized LLMs ☆10,059 · Updated 5 months ago