karpathy / minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
☆20,199 · Updated 3 months ago
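For orientation before the related-project list, here is a rough sketch of what training a model with minGPT looks like, following the usage pattern described in the repository's README. The `RandomTokens` toy dataset is invented here purely to keep the snippet self-contained, and the config field names assume the current `CfgNode`-based API on master, so they may differ in older revisions.

```python
# Hedged sketch of a minGPT training run; RandomTokens is a made-up toy dataset,
# not part of minGPT, used only so the example can run end to end.
import torch
from torch.utils.data import Dataset
from mingpt.model import GPT
from mingpt.trainer import Trainer

class RandomTokens(Dataset):
    """Toy dataset of random token sequences yielding (input, next-token target) pairs."""
    def __init__(self, n=256, block_size=64, vocab_size=128):
        self.block_size, self.vocab_size = block_size, vocab_size
        self.data = torch.randint(vocab_size, (n, block_size + 1))
    def __len__(self):
        return len(self.data)
    def __getitem__(self, i):
        chunk = self.data[i]
        return chunk[:-1], chunk[1:]   # inputs and targets shifted by one position

train_dataset = RandomTokens()

# Model config: pick a preset size, then describe the token and context space.
model_config = GPT.get_default_config()
model_config.model_type = 'gpt-nano'
model_config.vocab_size = train_dataset.vocab_size
model_config.block_size = train_dataset.block_size
model = GPT(model_config)

# Trainer config: a short run on whatever device is available.
train_config = Trainer.get_default_config()
train_config.learning_rate = 5e-4
train_config.max_iters = 100
train_config.batch_size = 32
trainer = Trainer(train_config, model, train_dataset)
trainer.run()
```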
Related projects
Alternatives and complementary repositories for minGPT
- karpathy/nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆37,411 · Updated 3 months ago
- microsoft/unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆20,194 · Updated last week
- huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. ☆135,166 · Updated this week
- microsoft/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆35,508 · Updated this week
- Dao-AILab/flash-attention: Fast and memory-efficient exact attention ☆14,279 · Updated this week
- huggingface/trl: Train transformer language models with reinforcement learning. ☆10,086 · Updated this week
- UKPLab/sentence-transformers: State-of-the-Art Text Embeddings ☆15,368 · Updated this week
- EleutherAI/gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries ☆6,947 · Updated this week
- huggingface/peft: 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ☆16,471 · Updated this week
- gradio-app/gradio: Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ☆34,030 · Updated this week
- facebookresearch/faiss: A library for efficient similarity search and clustering of dense vectors. ☆31,488 · Updated this week
- huggingface/accelerate: 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆7,958 · Updated this week
- NVIDIA/Megatron-LM: Ongoing research training transformer models at scale ☆10,595 · Updated this week
- openai/tiktoken: tiktoken is a fast BPE tokeniser for use with OpenAI's models. ☆12,427 · Updated last month
- jessevig/bertviz: BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.) ☆6,952 · Updated last year
- tatsu-lab/stanford_alpaca: Code and documentation to train Stanford's Alpaca models, and generate the data. ☆29,561 · Updated 4 months ago
- BlinkDL/RWKV-LM: RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆12,672 · Updated this week
- lucidrains/PaLM-rlhf-pytorch: Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM ☆7,705 · Updated 10 months ago
- google/jax: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ☆30,532 · Updated this week
- microsoft/LoRA: Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" ☆10,776 · Updated 3 months ago
- triton-lang/triton: Development repository for the Triton language and compiler ☆13,443 · Updated this week
- facebookresearch/metaseq: Repo for external large-scale work ☆6,516 · Updated 6 months ago
- google/sentencepiece: Unsupervised text tokenizer for Neural Network-based text generation. ☆10,295 · Updated 2 weeks ago
- vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs ☆30,423 · Updated this week
- Lightning-AI/lit-llama: Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad… ☆5,994 · Updated 2 months ago
- lm-sys/FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆36,993 · Updated this week
- tloen/alpaca-lora: Instruct-tune LLaMA on consumer hardware ☆18,653 · Updated 3 months ago
- run-llama/llama_index: LlamaIndex is a data framework for your LLM applications ☆36,820 · Updated this week
- openai/gpt-2: Code for the paper "Language Models are Unsupervised Multitask Learners" ☆22,546 · Updated 3 months ago
- artidoro/qlora: QLoRA: Efficient Finetuning of Quantized LLMs ☆10,059 · Updated 5 months ago