williamFalcon / minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

☆27

Alternatives and similar repositories for minGPT

Users who are interested in minGPT are comparing it to the libraries listed below.

allenai / tpu_pretrain
LM Pretraining with PyTorch/TPU
☆135 Updated 5 years ago
gsarti / t5-flax-gcp
Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP
☆58 Updated 2 years ago
SeanNaren / minGPT
A minimal PyTorch Lightning OpenAI GPT with DeepSpeed Training!
☆112 Updated 2 years ago
ofirpress / shortformer
Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis.
☆147 Updated 3 years ago
lucidrains / charformer-pytorch
Implementation of the GBST block from the Charformer paper, in Pytorch
☆117 Updated 4 years ago
clovaai / length-adaptive-transformer
Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021)
☆101 Updated 4 years ago
cimeister / typical-sampling
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
☆82 Updated 3 years ago
cccwam / rc2020_electra
ML Reproducibility Challenge 2020: Electra reimplementation using PyTorch and Transformers
☆12 Updated 4 years ago
microsoft / xtreme-distil-transformers
XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale
☆155 Updated last year
nreimers / flax-sentence-embeddings
Shared code for training sentence embeddings with Flax / JAX
☆27 Updated 4 years ago
heartcored98 / transformer_anatomy
Official Pytorch implementation of (Roles and Utilization of Attention Heads in Transformer-based Neural Language Models), ACL 2020
☆16 Updated 4 months ago
tshrjn / Finetune-QA
BERT, RoBERTa fine-tuning over SQuAD Dataset using pytorch-lightning⚡️, 🤗-transformers & 🤗-nlp.
☆36 Updated 2 years ago
huggingface / olm-training
Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.
☆93 Updated 2 years ago
lucidrains / electra-pytorch
A simple and working implementation of Electra, the fastest way to pretrain language models from scratch, in Pytorch
☆227 Updated 2 years ago
wietsedv / gpt2-recycle
As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021)
☆48 Updated 3 years ago
zphang / minimal-opt
☆67 Updated 2 years ago
ofirpress / sandwich_transformer
This repository contains the code for running the character-level Sandwich Transformers from our ACL 2020 paper on Improving Transformer …
☆55 Updated 4 years ago
lucidrains / marge-pytorch
Implementation of Marge, Pre-training via Paraphrasing, in Pytorch
☆76 Updated 4 years ago
nng555 / ssmba
☆62 Updated 3 years ago
yang-zhang / lightning-language-modeling
Language Modeling Example with Transformers and PyTorch Lightning
☆65 Updated 4 years ago
elephantmipt / bert-distillation
Distillation of BERT model with catalyst framework
☆78 Updated 2 years ago
jwieting / paraphrastic-representations-at-scale
☆75 Updated 4 years ago
monologg / EncT5
Pytorch Implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks
☆63 Updated 3 years ago
allenai / sledgehammer
☆47 Updated 5 years ago
uds-lsv / bert-stable-fine-tuning
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
☆136 Updated last year
pytorch-tpu / examples
This repository contains example code to build models on TPUs
☆30 Updated 2 years ago
google-research / t5x_retrieval
☆100 Updated 2 years ago
wangcongcong123 / ttt
A package for fine-tuning Transformers with TPUs, written in Tensorflow2.0+
☆38 Updated 4 years ago
huggingface / olm-datasets
Pipeline for pulling and processing online language model pretraining data from the web
☆178 Updated last year
CPJKU / wechsel
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
☆82 Updated 10 months ago