karpathy / minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
☆22,314 · Updated 11 months ago
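For a quick sense of what the repository provides, the sketch below follows the usage pattern shown in the minGPT README: build a `GPT` from a default config, then hand it to a `Trainer` together with a dataset of token sequences. The `RandomTokenDataset` here is a hypothetical placeholder for real training data, and exact config field names may differ slightly across versions of the repo.

```python
import torch
from torch.utils.data import Dataset

from mingpt.model import GPT
from mingpt.trainer import Trainer

class RandomTokenDataset(Dataset):
    """Placeholder dataset: random token sequences, shifted by one position for next-token targets."""
    def __init__(self, vocab_size=100, block_size=32, length=1000):
        self.vocab_size, self.block_size, self.length = vocab_size, block_size, length
    def __len__(self):
        return self.length
    def __getitem__(self, idx):
        seq = torch.randint(0, self.vocab_size, (self.block_size + 1,))
        return seq[:-1], seq[1:]  # (input tokens, next-token targets)

train_dataset = RandomTokenDataset()

# configure a small GPT
model_config = GPT.get_default_config()
model_config.model_type = 'gpt-nano'               # one of the preset model sizes
model_config.vocab_size = train_dataset.vocab_size
model_config.block_size = train_dataset.block_size
model = GPT(model_config)

# configure the trainer and run a short training loop
train_config = Trainer.get_default_config()
train_config.learning_rate = 5e-4
train_config.max_iters = 100
train_config.batch_size = 32
trainer = Trainer(train_config, model, train_dataset)
trainer.run()
```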
Alternatives and similar repositories for minGPT
Users who are interested in minGPT are comparing it to the repositories listed below
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆43,152 · Updated 7 months ago
- State-of-the-Art Text Embeddings ☆17,187 · Updated this week
- Ongoing research training transformer models at scale ☆12,960 · Updated this week
- Repo for external large-scale work ☆6,528 · Updated last year
- A playbook for systematically maximizing the performance of deep learning models. ☆28,978 · Updated last year
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆8,951 · Updated last week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆39,467 · Updated this week
- Fast and memory-efficient exact attention ☆18,551 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆21,572 · Updated 3 weeks ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization (see the BPE sketch after this list). ☆9,772 · Updated last year
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ☆13,832 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆31,656 · Updated last month
- CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image ☆29,907 · Updated last year
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ☆12,389 · Updated 11 months ago
- A library for efficient similarity search and clustering of dense vectors. ☆36,305 · Updated this week
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ☆147,239 · Updated this week
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,572 · Updated last year
- BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.) ☆7,549 · Updated last month
- Code for the paper "Language Models are Unsupervised Multitask Learners" ☆23,920 · Updated 11 months ago
- An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries ☆7,266 · Updated 3 weeks ago
- Unsupervised text tokenizer for Neural Network-based text generation. ☆11,105 · Updated last week
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others) ☆9,053 · Updated 3 weeks ago
- Train transformer language models with reinforcement learning. ☆14,675 · Updated this week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ☆19,116 · Updated this week
- llama3 implementation one matrix multiplication at a time ☆15,050 · Updated last year
- Google Research ☆36,062 · Updated last week
- An autoregressive character-level language model for making more things ☆3,191 · Updated last year
- Inference Llama 2 in one file of pure C ☆18,582 · Updated 11 months ago
- Code and documentation to train Stanford's Alpaca models, and generate the data. ☆30,087 · Updated last year
- An unnecessarily tiny implementation of GPT-2 in NumPy. ☆3,391 · Updated 2 years ago
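The Byte Pair Encoding entry above describes the tokenization scheme most GPT-style models rely on. As a rough, self-contained illustration (not the API of any repository listed here), the sketch below learns byte-level BPE merges by repeatedly replacing the most frequent adjacent token pair with a new token id:

```python
from collections import Counter

def get_pair_counts(ids):
    """Count occurrences of each adjacent token pair."""
    return Counter(zip(ids, ids[1:]))

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn up to `num_merges` BPE merges over the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))          # start from raw bytes (ids 0..255)
    merges = {}                                # (token_a, token_b) -> new token id
    for i in range(num_merges):
        counts = get_pair_counts(ids)
        if not counts:
            break
        pair = counts.most_common(1)[0][0]     # most frequent adjacent pair
        new_id = 256 + i                       # new tokens start after the byte range
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges, ids

merges, ids = train_bpe("low lower lowest low low", num_merges=10)
print(len("low lower lowest low low".encode("utf-8")), "bytes ->", len(ids), "tokens")
```

Production tokenizers add regex pre-splitting, special-token handling, and serialization on top, but the core merge loop is the same idea.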