rish-16 / gpt3-pytorch
Unofficial PyTorch Implementation of OpenAI's GPT-3
☆13 · Updated 3 years ago
Alternatives and similar repositories for gpt3-pytorch
Users interested in gpt3-pytorch are comparing it to the libraries listed below.
- Analytical solutions for logistic regression and single-layer softmax ☆12 · Updated 4 years ago
- A *tuned* minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training ☆116 · Updated 4 years ago
- Virtual Adversarial Training (VAT) techniques in PyTorch ☆17 · Updated 3 years ago
- ☆24 · Updated 2 years ago
- Finetune CPM-1 ☆24 · Updated 4 years ago
- A translation task using TurboTransformers ☆11 · Updated 4 years ago
- Code for the paper "Deformable Butterfly: A Highly Structured and Sparse Linear Transform" ☆12 · Updated 3 years ago
- A Chinese-native, difficulty-graded benchmark for code capability ☆15 · Updated last year
- RWKV-v2-RNN trained on the Pile. See https://github.com/BlinkDL/RWKV-LM for details. ☆67 · Updated 2 years ago
- Implementation of Multistream Transformers in PyTorch ☆54 · Updated 4 years ago
- Transformers at any scale ☆41 · Updated last year
- Implementation of a Transformer using ReLA (Rectified Linear Attention), from https://arxiv.org/abs/2104.07012 ☆49 · Updated 3 years ago
- GPT-2 fine-tuning with 🤗 Transformers ☆28 · Updated 4 years ago
- Large-scale distributed model training strategy with Colossal AI and Lightning AI ☆56 · Updated last year
- Official repository for Efficient Linear-Time Attention Transformers ☆18 · Updated last year
- A Transformer-based single-model, multi-scale VAE ☆57 · Updated 4 years ago
- ☆14 · Updated last year
- PyTorch implementation of FNet: Mixing Tokens with Fourier Transforms ☆27 · Updated 4 years ago
- A Python implementation of Toolformer using Hugging Face Transformers ☆14 · Updated 2 years ago
- Some microbenchmarks and design docs before commencement ☆12 · Updated 4 years ago
- Implementation of Memory-Compressed Attention, from the paper "Generating Wikipedia by Summarizing Long Sequences" ☆70 · Updated 2 years ago
- An implementation of Compositional Attention: Disentangling Search and Retrieval, by Mila ☆14 · Updated 3 years ago
- ☆29 · Updated 2 years ago
- A Python library for highly configurable Transformers, easing model architecture search and experimentation ☆49 · Updated 3 years ago
- 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX ☆82 · Updated 3 years ago
- ALBERT for the Conversational Question Answering Challenge ☆23 · Updated 2 years ago
- Implementation of N-Grammer, augmenting Transformers with latent n-grams, in PyTorch ☆76 · Updated 2 years ago
- A variant of Transformer-XL where the memory is updated not with a queue but with attention ☆49 · Updated 5 years ago
- Implementation of Token Shift GPT, an autoregressive model that relies solely on shifting the sequence space for mixing ☆50 · Updated 3 years ago
- JAX implementation of the bart-base model ☆32 · Updated 2 years ago