laiguokun / bert-cloth
☆40Updated 4 years ago
Alternatives and similar repositories for bert-cloth:
Users who are interested in bert-cloth are comparing it to the libraries listed below.
- Source code of paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning"☆126Updated 3 years ago
- ☆78Updated 2 years ago
- Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding in Language Pre-training". Improve exis…☆250Updated 3 years ago
- Code for the RecAdam paper: Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting.☆115Updated 4 years ago
- Notes from my introduction to NLP at Fudan University☆37Updated 3 years ago
- ☆50Updated last year
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators☆91Updated 3 years ago
- Code release for our arXiv paper "Revisiting Few-sample BERT Fine-tuning" (https://arxiv.org/abs/2006.05987).☆184Updated last year
- Source code for "Efficient Training of BERT by Progressively Stacking"☆112Updated 5 years ago
- ☆69Updated 4 years ago
- Pretrain CPM-1☆51Updated 3 years ago
- Deep learning images developed from nvidia/cuda-cudnn-devel-ubuntu.☆23Updated 2 years ago
- DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference☆154Updated 2 years ago
- A plug-in for Microsoft DeepSpeed that fixes a bug in the DeepSpeed pipeline☆26Updated 3 years ago
- Code for our EMNLP 2019 paper☆36Updated 5 years ago
- Code for ACL 2019 paper: "Searching for Effective Neural Extractive Summarization: What Works and What's Next"☆90Updated 3 years ago
- Source code for our ACL 2019 paper "Incremental Transformer with Deliberation Decoder for Document Grounded Conversations"☆86Updated 5 years ago
- "Target-Guided Open-Domain Conversation" in ACL 2019☆148Updated 5 years ago
- LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts (AAAI 2019)☆123Updated 6 years ago
- ☆83Updated 5 years ago
- Implementation of "Neural Machine Translation by Jointly Learning to Align and Translate"☆26Updated 7 years ago
- ☆75Updated 2 years ago
- Conversational Toolkit. An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation☆127Updated 4 years ago
- LAMB Optimizer for Large Batch Training (TensorFlow version)☆120Updated 5 years ago
- Code for the ICML'20 paper "Improving Transformer Optimization Through Better Initialization"☆89Updated 4 years ago
- Differentiable Product Quantization for End-to-End Embedding Compression.☆59Updated 2 years ago
- Re-implementation of "QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension"☆120Updated 6 years ago
- PyTorch Implementation of "Non-Autoregressive Neural Machine Translation"☆268Updated 3 years ago
- Code for the paper "Are Sixteen Heads Really Better than One?"☆171Updated 4 years ago
- Must-read papers on improving efficiency for pre-trained language models.☆102Updated 2 years ago