BlinkDL / minGPT-tuned
A *tuned* minimal PyTorch re-implementation of OpenAI GPT (Generative Pretrained Transformer) training
☆107 · Updated 3 years ago
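For context, minGPT-style training is plain next-token prediction with a cross-entropy loss. The sketch below is illustrative only, not code from BlinkDL/minGPT-tuned: `model` (assumed to return logits of shape `(batch, seq, vocab)`) and `dataset` (assumed to yield token tensors `x` with one-step-shifted targets `y`) are hypothetical stand-ins.

```python
# Minimal sketch of a GPT-style training loop (hypothetical model/dataset).
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

def train(model, dataset, epochs=1, lr=3e-4, device="cpu"):
    model = model.to(device).train()
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loader = DataLoader(dataset, batch_size=64, shuffle=True)
    for _ in range(epochs):
        for x, y in loader:                 # x: tokens, y: x shifted by one
            x, y = x.to(device), y.to(device)
            logits = model(x)               # (batch, seq, vocab)
            loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1))
            opt.zero_grad()
            loss.backward()
            # gradient clipping, a common stabilizer in GPT training
            torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
            opt.step()
```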
Related projects
Alternatives and complementary repositories for minGPT-tuned
- A Transformer model based on the Gated Attention Unit (preview version); see the GAU sketch after this list. ☆97 · Updated last year
- A single-model, multi-scale VAE based on the Transformer. ☆53 · Updated 3 years ago
- Implementation of Memory-Compressed Attention, from the paper "Generating Wikipedia by Summarizing Long Sequences". ☆71 · Updated last year
- FLASHQuad_pytorch ☆66 · Updated 2 years ago
- Axial Positional Embedding for PyTorch ☆60 · Updated 3 years ago
- A PyTorch & Keras implementation and demo of Fastformer. ☆187 · Updated 2 years ago
- ☆45 · Updated 2 years ago
- Implementation of Memformer, a memory-augmented Transformer, in PyTorch. ☆106 · Updated 4 years ago
- Official code for the ICLR 2022 paper "PoNet: Pooling Network for Efficient Token Mixing in Long Sequences". ☆31 · Updated last year
- Implementation of the retriever distillation procedure as outlined in the paper "Distilling Knowledge from Reader to Retriever". ☆32 · Updated 3 years ago
- ICLR 2023 - Tailoring Language Generation Models under Total Variation Distance
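Two entries above (the GAU Transformer and FLASHQuad_pytorch) build on the Gated Attention Unit from "Transformer Quality in Linear Time" (Hua et al., 2022). The following is a minimal, illustrative PyTorch sketch of a GAU layer, not code from either repository; the layer sizes are assumptions, and causal masking plus the paper's relative position bias are omitted for brevity.

```python
# Minimal sketch of a Gated Attention Unit (GAU) layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GAU(nn.Module):
    def __init__(self, dim=512, expansion=2, s=128):  # sizes are illustrative
        super().__init__()
        e = dim * expansion
        self.norm = nn.LayerNorm(dim)
        self.to_uv = nn.Linear(dim, e * 2)            # gate u and value v
        self.to_z = nn.Linear(dim, s)                 # shared low-dim base for q, k
        self.gamma = nn.Parameter(torch.ones(2, s))   # cheap per-channel scales
        self.beta = nn.Parameter(torch.zeros(2, s))   # and offsets for q, k
        self.to_out = nn.Linear(e, dim)

    def forward(self, x):                             # x: (batch, seq, dim)
        n = x.shape[1]
        h = self.norm(x)
        u, v = self.to_uv(h).chunk(2, dim=-1)
        u, v = F.silu(u), F.silu(v)
        z = F.silu(self.to_z(h))
        # q and k are affine transforms of the same projection z
        q = z * self.gamma[0] + self.beta[0]
        k = z * self.gamma[1] + self.beta[1]
        # squared-ReLU attention instead of softmax
        a = F.relu(q @ k.transpose(-1, -2) / n) ** 2  # (batch, seq, seq)
        return x + self.to_out(u * (a @ v))           # gated output + residual
```

The single gated unit replaces the separate attention and feed-forward blocks of a standard Transformer layer, which is what lets GAU-based models get away with a much weaker (softmax-free) attention mechanism.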