BlinkDL / minGPT-tuned
A *tuned* minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
☆107 · Updated 3 years ago
Alternatives and similar repositories for minGPT-tuned:
Users interested in minGPT-tuned are comparing it to the libraries listed below.
- A Transformer-based single-model, multi-scale VAE ☆55 · Updated 3 years ago
- [NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining ☆118 · Updated last year
- A Transformer model based on the Gated Attention Unit (preview version) ☆97 · Updated 2 years ago
- Transformers at any scale ☆41 · Updated last year
- ☆24 · Updated 2 years ago
- ☆116 · Updated 2 years ago
- A PyTorch implementation of the paper "Synthesizer: Rethinking Self-Attention in Transformer Models" ☆73 · Updated 2 years ago
- Code for the paper "Scheduled Sampling for Transformers" ☆25 · Updated 5 years ago
- Unicoder model for understanding and generation. ☆89 · Updated last year
- Source code for paper: Knowledge Inheritance for Pre-trained Language Models ☆38 · Updated 2 years ago
- Implementation of Memory-Compressed Attention, from the paper "Generating Wikipedia By Summarizing Long Sequences" ☆70 · Updated last year
- K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (Findings of EMNLP … ☆31 · Updated 2 years ago
- ☆50 · Updated last year
- Implementation of the retriever distillation procedure as outlined in the paper "Distilling Knowledge from Reader to Retriever" ☆32 · Updated 4 years ago
- Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI ☆57 · Updated last year
- Code for the paper "A Theoretical Analysis of the Repetition Problem in Text Generation" in AAAI 2021. ☆52 · Updated 2 years ago
- Source code for NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference" ☆46 · Updated 2 years ago
- Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch ☆70 · Updated 4 years ago
- DSTC10 Track1 - MOD: Internet Meme Incorporated Open-domain Dialog ☆50 · Updated 2 years ago
- Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch ☆45 · Updated 4 years ago
- A simple experiment with Ladder Side-Tuning on the CLUE benchmark ☆19 · Updated 2 years ago
- ☆46 · Updated 3 years ago
- FairSeq repo with Apollo optimizer ☆110 · Updated last year
- Source code of paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning" ☆127 · Updated 3 years ago
- ☆32 · Updated 3 years ago
- Codes and pre-trained models of paper "Segatron: Segment-aware Transformer for Language Modeling and Understanding" ☆18 · Updated 2 years ago
- Open source code and data for AAAI 2022 Oral Paper "Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding" ☆33 · Updated 9 months ago
- A simple module that consistently outperforms self-attention and Transformer models on main NMT datasets with SoTA performance. ☆86 · Updated last year
- FLASHQuad_pytorch ☆67 · Updated 2 years ago
- This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron … ☆32 · Updated last year