BlinkDL / minGPT-tuned
A *tuned* minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
☆106 Updated 3 years ago
Related projects:
- A single-model, multi-scale VAE based on Transformer ☆53 Updated 3 years ago
- A PyTorch implementation of the paper - "Synthesizer: Rethinking Self-Attention in Transformer Models" ☆70 Updated last year
- A PyTorch & Keras implementation and demo of Fastformer. ☆184 Updated last year
- Transformer model based on the Gated Attention Unit (preview version) ☆95 Updated last year
- Implementation of Memory-Compressed Attention, from the paper "Generating Wikipedia By Summarizing Long Sequences" ☆72 Updated last year
- ICLR2023 - Tailoring Language Generation Models under Total Variation Distance ☆21 Updated last year
- FLASHQuad_pytorch ☆66 Updated 2 years ago
- A simple attempt at Ladder Side-Tuning on CLUE ☆19 Updated 2 years ago
- ☆58 Updated 2 years ago
- ☆44 Updated 2 years ago
- UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning ☆69 Updated 3 years ago
- Axial Positional Embedding for Pytorch ☆61 Updated 3 years ago
- WuDaoMM: this is a data project ☆65 Updated 2 years ago
- Implementation of Memformer, a Memory-augmented Transformer, in Pytorch ☆106 Updated 3 years ago
- FlatNCE: A Novel Contrastive Representation Learning Objective ☆83 Updated 2 years ago
- Multitask Multilingual Multimodal Pre-training ☆68 Updated last year
- [NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining ☆120 Updated last year
- ☆50 Updated this week
- Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch ☆70 Updated 4 years ago
- Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch ☆116 Updated 3 years ago
- [ICLR 2022] Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention ☆178 Updated last year
- A PyTorch implementation of Adafactor (https://arxiv.org/pdf/1804.04235.pdf) ☆24 Updated 5 years ago
- Implementation of Fast Transformer in Pytorch ☆171 Updated 3 years ago
- Unicoder model for understanding and generation. ☆88 Updated 9 months ago
- Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch ☆45 Updated 3 years ago
- ☆72 Updated 2 years ago
- [ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408 ☆188 Updated last year
- Analytical solutions for logistic regression and single-layer softmax ☆12 Updated 3 years ago
- ☆65 Updated 3 weeks ago
- Must-read papers on improving efficiency for pre-trained language models. ☆100 Updated last year