karpathy / minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
★ 21,817 · Updated 8 months ago
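For orientation, here is a minimal usage sketch in the spirit of minGPT's documented API (`GPT.get_default_config()` plus a `Trainer`); the exact config field values below are illustrative assumptions, so check the repo's README before relying on them.

```python
# Minimal sketch of training a small GPT with minGPT
# (follows the usage pattern shown in the repo's README;
#  specific config values here are assumptions to verify).
from mingpt.model import GPT
from mingpt.trainer import Trainer

model_config = GPT.get_default_config()
model_config.model_type = 'gpt-nano'   # tiny preset for quick experiments
model_config.vocab_size = 50257        # GPT-2 BPE vocabulary size
model_config.block_size = 128          # maximum context length in tokens
model = GPT(model_config)

train_config = Trainer.get_default_config()
train_config.learning_rate = 5e-4
train_config.max_iters = 2000
# train_dataset: your torch Dataset yielding (x, y) token blocks
trainer = Trainer(train_config, model, train_dataset)
trainer.run()
```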
Alternatives and similar repositories for minGPT:
Users interested in minGPT are comparing it to the libraries listed below.
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ★ 40,976 · Updated 4 months ago
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API (see the autograd sketch after this list). ★ 11,745 · Updated 8 months ago
- 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. ★ 143,804 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★ 31,390 · Updated 3 months ago
- Fast and memory-efficient exact attention ★ 17,192 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★ 38,206 · Updated this week
- Train transformer language models with reinforcement learning. ★ 13,559 · Updated this week
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ★ 13,564 · Updated this week
- TensorFlow code and pre-trained models for BERT ★ 39,081 · Updated 9 months ago
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★ 21,171 · Updated 2 months ago
- Inference Llama 2 in one file of pure C ★ 18,321 · Updated 9 months ago
- Ongoing research training transformer models at scale ★ 12,261 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★ 8,673 · Updated this week
- Code and documentation to train Stanford's Alpaca models, and generate the data. ★ 29,972 · Updated 9 months ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ★ 9,611 · Updated 10 months ago
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others); see the tensor-operations sketch after this list. ★ 8,882 · Updated last week
- BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.) ★ 7,373 · Updated last year
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★ 18,274 · Updated this week
- Unsupervised text tokenizer for Neural Network-based text generation. ★ 10,836 · Updated last month
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ★ 28,645 · Updated this week
- Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM ★ 7,794 · Updated last week
- Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes. ★ 29,384 · Updated last week
- Inference code for Llama models ★ 58,174 · Updated 3 months ago
- A library for efficient similarity search and clustering of dense vectors. ★ 34,661 · Updated this week
- A concise but complete full-attention transformer with a set of promising experimental features from various papers ★ 5,288 · Updated last week
- Repo for external large-scale work ★ 6,524 · Updated last year
- State-of-the-Art Text Embeddings ★ 16,611 · Updated last week
- 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production ★ 9,646 · Updated 2 weeks ago
- Instruct-tune LLaMA on consumer hardware ★ 18,902 · Updated 9 months ago
- Running large language models on a single GPU for throughput-oriented scenarios. ★ 9,312 · Updated 6 months ago
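As referenced above for the scalar-valued autograd entry, here is a short sketch of what such an engine does: scalar `Value` objects record the computation graph as you compute, and `.backward()` runs reverse-mode autodiff over it. The `Value` API shown follows micrograd's README; treat it as a sketch rather than a spec.

```python
# Scalar Values build a computation graph; .backward() backpropagates.
from micrograd.engine import Value

a = Value(-4.0)
b = Value(2.0)
c = a + b               # graph is built as you compute
d = a * b + b**3
c += c + 1
e = c.relu() + (d - a).relu()
loss = e * 2 + d
loss.backward()         # reverse-mode autodiff over the recorded graph

print(loss.data)        # forward value
print(a.grad, b.grad)   # d(loss)/da, d(loss)/db
```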
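Likewise, for the tensor-operations entry, a small example of the style of API it describes: `rearrange` and `reduce` express reshapes, transposes, and pooling as readable pattern strings. This uses the standard einops functions; the tensor shapes are illustrative.

```python
# Tensor reshapes/transposes written as readable einops patterns.
import torch
from einops import rearrange, reduce

x = torch.randn(8, 3, 32, 32)             # batch, channels, height, width
y = rearrange(x, 'b c h w -> b (h w) c')  # flatten spatial dims into a sequence
z = reduce(x, 'b c h w -> b c', 'mean')   # global average pool
print(y.shape, z.shape)                   # (8, 1024, 3) and (8, 3)
```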