karpathy / minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
☆22,582 · Updated last year
Alternatives and similar repositories for minGPT
Users interested in minGPT are comparing it to the libraries listed below.
- nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆44,331 · Updated 9 months ago
- micrograd: A tiny scalar-valued autograd engine and a neural net library on top of it with a PyTorch-like API (see the usage sketch after this list). ☆12,736 · Updated last year
- flash-attention: Fast and memory-efficient exact attention. ☆19,471 · Updated this week
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ☆149,853 · Updated this week
- trl: Train transformer language models with reinforcement learning. ☆15,520 · Updated this week
- sentencepiece: Unsupervised text tokenizer for Neural Network-based text generation. ☆11,270 · Updated this week
- pytorch-lightning: Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes. ☆30,113 · Updated this week
- gpt-2: Code for the paper "Language Models are Unsupervised Multitask Learners". ☆24,160 · Updated last year
- llama2.c: Inference Llama 2 in one file of pure C. ☆18,735 · Updated last year
- Megatron-LM: Ongoing research training transformer models at scale. ☆13,541 · Updated this week
- minbpe: Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization (see the BPE sketch after this list). ☆9,922 · Updated last year
- DeepSpeed: a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆40,058 · Updated this week
- accelerate: 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆9,133 · Updated this week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ☆19,570 · Updated this week
- makemore: An autoregressive character-level language model for making more things. ☆3,304 · Updated last year
- jax: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more. ☆33,428 · Updated this week
- RWKV-LM: RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ☆13,965 · Updated this week
- triton: Development repository for the Triton language and compiler. ☆16,831 · Updated this week
- llama: Inference code for Llama models. ☆58,737 · Updated 7 months ago
- LoRA: Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" (see the sketch after this list). ☆12,660 · Updated 9 months ago
- llm.c: LLM training in simple, raw C/CUDA. ☆27,588 · Updated 2 months ago
- ☆4,193 · Updated last year
- tiktoken: a fast BPE tokeniser for use with OpenAI's models. ☆15,911 · Updated 2 weeks ago
- metaseq: Repo for external large-scale work. ☆6,547 · Updated last year
- gpt-fast: Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆6,085 · Updated 3 weeks ago
- llama3-from-scratch: llama3 implementation one matrix multiplication at a time. ☆15,148 · Updated last year
- fairseq: Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆31,777 · Updated last week
- bitsandbytes: Accessible large language models via k-bit quantization for PyTorch. ☆7,584 · Updated this week
- einops: Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others) (see the usage sketch after this list). ☆9,164 · Updated last month
- ggml: Tensor library for machine learning. ☆13,134 · Updated last week
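
A few of the entries above are small enough to illustrate directly. First, micrograd: the sketch below follows the usage example in the repo's README (the `Value` class lives in `micrograd/engine.py`). Overloaded operators build a computation graph and `backward()` runs reverse-mode autodiff.

```python
# Usage sketch for micrograd, after the example in its README.
# Assumes `pip install micrograd`.
from micrograd.engine import Value

a = Value(2.0)
b = Value(-3.0)
c = a * b + a**2   # operators build a tiny computation graph
c.backward()       # reverse-mode autodiff from c back to the leaves

print(c.data)   # -2.0
print(a.grad)   # dc/da = b + 2a = 1.0
print(b.grad)   # dc/db = a = 2.0
```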
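The Byte Pair Encoding algorithm behind minbpe and tiktoken is simple at its core: repeatedly count adjacent token pairs and replace the most frequent pair with a new token id. Below is a minimal self-contained sketch of that loop; `train_bpe` is a hypothetical helper for illustration, not minbpe's actual API.

```python
# Minimal sketch of the core BPE training loop (illustrative; not minbpe's API).
from collections import Counter

def train_bpe(text: str, num_merges: int):
    ids = list(text.encode("utf-8"))        # start from raw bytes (ids 0..255)
    merges = {}                             # (left, right) -> new token id
    for new_id in range(256, 256 + num_merges):
        pairs = Counter(zip(ids, ids[1:]))  # count adjacent pairs
        if not pairs:
            break
        top = max(pairs, key=pairs.get)     # most frequent adjacent pair
        merges[top] = new_id
        out, i = [], 0
        while i < len(ids):                 # replace each occurrence with new_id
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == top:
                out.append(new_id)
                i += 2
            else:
                out.append(ids[i])
                i += 1
        ids = out
    return merges, ids

merges, ids = train_bpe("aaabdaaabac", 3)
print(merges)  # learned merge rules, e.g. {(97, 97): 256, ...}
print(ids)     # the text re-encoded with merged tokens
```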
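The LoRA paper's core trick is to freeze a pretrained weight W0 and learn a low-rank update, y = W0·x + (alpha/r)·B·A·x, with B initialized to zero so training starts from the unchanged model. Here is a minimal PyTorch sketch of that idea; `LoRALinear` is a hypothetical illustration, not loralib's actual class.

```python
# Sketch of the LoRA idea: y = W0 x + (alpha/r) * B A x, with the pretrained
# weight W0 frozen and only the low-rank factors A and B trained.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():
            p.requires_grad_(False)          # freeze the pretrained layer
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))  # B=0: no-op at init
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(512, 512, r=8)
y = layer(torch.randn(4, 512))  # (4, 512); only ~8k parameters are trainable
```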
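Finally, einops expresses tensor reshapes as named-axis patterns instead of chains of `view`/`permute`. A short usage sketch with `rearrange`, splitting and re-merging attention heads (shapes are illustrative):

```python
# Usage sketch for einops.rearrange: named-axis patterns instead of view/permute.
import torch
from einops import rearrange

x = torch.randn(2, 10, 8 * 64)                  # (batch, seq, heads*head_dim)
q = rearrange(x, "b s (h d) -> b h s d", h=8)   # split heads: (2, 8, 10, 64)
y = rearrange(q, "b h s d -> b s (h d)")        # merge back:  (2, 10, 512)
assert y.shape == (2, 10, 512)
```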