JetRunner / PABEE
Code for the paper "BERT Loses Patience: Fast and Robust Inference with Early Exit".
☆65 · Updated 3 years ago
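The patience mechanism from the paper can be sketched roughly as follows: each transformer layer carries its own classifier, and inference stops as soon as the intermediate prediction has stayed unchanged for `patience` consecutive layers. This is a minimal, framework-free illustration; the function name and the list-of-labels interface are assumptions for the sketch, not the repository's actual API.

```python
# Minimal sketch of patience-based early exit in the spirit of PABEE
# ("BERT Loses Patience"). Assumes each layer's classifier has already
# produced a label; `patience_early_exit` is an illustrative name, not
# the repository's API.

def patience_early_exit(layer_predictions, patience=2):
    """Return (final_prediction, layers_used).

    Walks the per-layer predictions in order and stops as soon as the
    prediction has stayed the same for `patience` consecutive layers,
    mimicking how PABEE skips the remaining transformer layers.
    """
    prev, streak = None, 0
    for layers_used, pred in enumerate(layer_predictions, start=1):
        streak = streak + 1 if pred == prev else 0
        prev = pred
        if streak >= patience:
            return pred, layers_used  # early exit: later layers never run
    return prev, len(layer_predictions)  # no exit: use the last layer
```

With `patience=2`, `patience_early_exit(["pos", "pos", "pos", "neg"])` exits after three layers and returns `("pos", 3)`, so the fourth classifier is never consulted.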
Alternatives and similar repositories for PABEE:
Users interested in PABEE are comparing it to the repositories listed below.
- Code for the EMNLP 2020 paper CoDIR · ☆41 · Updated 2 years ago
- Method to improve inference time for BERT; an implementation of the paper "PoWER-BERT: Accelerating BERT Inference via Pro… · ☆61 · Updated last year
- PyTorch implementation of the paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021) · ☆72 · Updated 3 years ago
- Source code for the NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference" · ☆46 · Updated 2 years ago
- Code for the AAAI 2021 paper "A Theoretical Analysis of the Repetition Problem in Text Generation" · ☆52 · Updated 2 years ago
- 🦮 Code and pretrained models for the Findings of ACL 2022 paper "LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrie… · ☆49 · Updated 2 years ago
- ☆63 · Updated 2 years ago
- ☆47 · Updated 4 years ago
- ☆116 · Updated 2 years ago
- ☆19 · Updated 4 years ago
- Code associated with the paper "Data Augmentation using Pre-trained Transformer Models" · ☆52 · Updated last year
- ☆43 · Updated 3 years ago
- ☆66 · Updated 3 years ago
- ☆54 · Updated 2 years ago
- Repo for the ICML 2023 paper "Why Do Nearest Neighbor Language Models Work?" · ☆56 · Updated 2 years ago
- ☆41 · Updated 4 years ago
- A unified approach to explaining conditional text generation models (PyTorch); code for the paper "Local Explanation of Dialogue Response Gene… · ☆17 · Updated 2 years ago
- Implementation of Mixout with PyTorch · ☆74 · Updated 2 years ago
- Code and datasets for the EMNLP 2020 paper "Calibration of Pre-trained Transformers" · ☆58 · Updated last year
- Code for the EMNLP 2021 paper "Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting" · ☆17 · Updated 3 years ago
- Source code for the Cutoff data augmentation approach proposed in the paper "A Simple but Tough-to-Beat Data Augmentation Approach … · ☆63 · Updated 4 years ago
- Implementation of "Neural Machine Translation without Embeddings" (NAACL 2021) · ☆33 · Updated 3 years ago
- Source code for the paper "Knowledge Inheritance for Pre-trained Language Models" · ☆38 · Updated 2 years ago
- Rationales for Sequential Predictions · ☆40 · Updated 3 years ago
- "No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models" (ICLR 2022) · ☆30 · Updated 3 years ago
- ReConsider, a re-ranking model that re-ranks the top-K (passage, answer-span) predictions of an open-domain QA model such as DPR (Karpukhi… · ☆49 · Updated 3 years ago
- [NeurIPS 2020] "The Lottery Ticket Hypothesis for Pre-trained BERT Networks", Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Ya… · ☆140 · Updated 3 years ago
- [EMNLP 2022] "Language Model Pre-Training with Sparse Latent Typing" · ☆14 · Updated 2 years ago
- Code for the paper "Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models" · ☆47 · Updated 2 years ago
- EMNLP 2021: "Single-dataset Experts for Multi-dataset Question-Answering" · ☆70 · Updated 3 years ago