karpathy / minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
☆21,097 · Updated 5 months ago
Alternatives and similar repositories for minGPT:
Users interested in minGPT are comparing it to the libraries listed below.
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆38,694 · Updated last month
- Train transformer language models with reinforcement learning. ☆10,661 · Updated this week
- Large-scale self-supervised pre-training across tasks, languages, and modalities. ☆20,624 · Updated 2 weeks ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆36,332 · Updated this week
- Ongoing research training transformer models at scale. ☆11,164 · Updated this week
- Repo for external large-scale work. ☆6,518 · Updated 8 months ago
- Fast and memory-efficient exact attention. ☆15,164 · Updated last week
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ☆13,034 · Updated this week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ☆17,052 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆8,197 · Updated this week
- GPT-3: Language Models are Few-Shot Learners. ☆15,716 · Updated 4 years ago
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ☆35,433 · Updated this week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆37,569 · Updated this week
- Inference Llama 2 in one file of pure C. ☆17,897 · Updated 5 months ago
- A tiny scalar-valued autograd engine and a neural net library on top of it with a PyTorch-like API. ☆10,968 · Updated 5 months ago
- Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch. ☆11,206 · Updated 8 months ago
- Code and documentation to train Stanford's Alpaca models, and generate the data. ☆29,758 · Updated 6 months ago
- tiktoken is a fast BPE tokeniser for use with OpenAI's models. ☆13,125 · Updated 3 months ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆30,841 · Updated 2 weeks ago
- Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models". ☆11,155 · Updated last month
- Making large AI models cheaper, faster and more accessible. ☆39,026 · Updated this week
- An implementation of model-parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries. ☆7,051 · Updated this week
- This repository contains demos made with the Transformers library by HuggingFace. ☆9,830 · Updated last week
- LLM training in simple, raw C/CUDA. ☆25,111 · Updated 3 months ago
- Instruct-tune LLaMA on consumer hardware. ☆18,768 · Updated 5 months ago
- Hackable and optimized Transformers building blocks, supporting a composable construction. ☆8,939 · Updated last week
- Running large language models on a single GPU for throughput-oriented scenarios. ☆9,256 · Updated 2 months ago
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA), built towards GPT-4V-level capabilities and beyond. ☆21,193 · Updated 5 months ago
- QLoRA: Efficient Finetuning of Quantized LLMs. ☆10,173 · Updated 7 months ago
- Inference code for Llama models. ☆57,305 · Updated 5 months ago
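Several entries above (the fast BPE tokeniser, the minimal GPT training repos) revolve around byte-pair encoding, the tokenisation scheme GPT models use. As a rough illustration of the idea only — this is a toy pure-Python sketch, not tiktoken's API or its byte-level algorithm — BPE training repeatedly merges the most frequent adjacent pair of tokens:

```python
from collections import Counter

def bpe_train(text, num_merges):
    """Toy BPE: start from characters and repeatedly fuse the most
    frequent adjacent pair into a single token."""
    tokens = list(text)      # initial vocabulary: single characters
    merges = []              # learned merge rules, in order
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:        # no pair repeats; nothing worth merging
            break
        merges.append((a, b))
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                merged.append(a + b)   # apply the new merge rule
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens, merges

tokens, merges = bpe_train("low lower lowest", 3)
```

On this tiny corpus the first merges learned are `('l', 'o')` and then `('lo', 'w')`, so the frequent substring "low" quickly becomes one token; real tokenisers such as tiktoken apply the same principle at the byte level with a precomputed merge table and far more efficient data structures.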