JetRunner / PABEE
Code for the paper "BERT Loses Patience: Fast and Robust Inference with Early Exit".
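PABEE's core idea is patience-based early exit: an internal classifier is attached after each transformer layer, and inference stops as soon as the predicted label has stayed unchanged for `patience` consecutive layers. A minimal illustrative sketch of that control flow (the `layers`/`classifiers` callables are hypothetical stand-ins, not this repo's API):

```python
def patience_early_exit(hidden, layers, classifiers, patience=2):
    """Run layers one at a time, classify after each, and exit early
    once the prediction has been stable for `patience` consecutive steps.

    `layers` and `classifiers` are callables standing in for transformer
    blocks and per-layer prediction heads. Returns (prediction, exit_layer).
    """
    prev_pred, unchanged = None, 0
    for i, (layer, clf) in enumerate(zip(layers, classifiers), start=1):
        hidden = layer(hidden)
        pred = clf(hidden)                # predicted class label at layer i
        # count consecutive layers whose prediction did not change
        unchanged = unchanged + 1 if pred == prev_pred else 0
        prev_pred = pred
        if unchanged >= patience:         # prediction is stable: exit early
            return pred, i
    return prev_pred, len(layers)         # fell through: use the final layer
```

With `patience=2`, a model whose per-layer predictions are `0, 1, 1, 1, 2` would exit at layer 4 with label 1, skipping the remaining layer entirely.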
☆64 · Updated 3 years ago
Alternatives and similar repositories for PABEE:
Users interested in PABEE are comparing it to the repositories listed below.
- Method to improve inference time for BERT; an implementation of the paper "PoWER-BERT: Accelerating BERT Inference via Pro…" ☆59 · Updated last year
- PyTorch implementation of the paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021) ☆71 · Updated 3 years ago
- Code for the paper "A Theoretical Analysis of the Repetition Problem in Text Generation" (AAAI 2021) ☆51 · Updated 2 years ago
- BlackboxNLP (EMNLP 2020): "Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples" ☆23 · Updated 4 years ago
- ☆47 · Updated 4 years ago
- Code for the EMNLP 2020 paper CoDIR ☆41 · Updated 2 years ago
- ☆63 · Updated 2 years ago
- Source code for the NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference" ☆45 · Updated 2 years ago
- Source code for the Cutoff data augmentation approach proposed in the paper "A Simple but Tough-to-Beat Data Augmentation Approach …" ☆62 · Updated 4 years ago
- ☆116 · Updated 2 years ago
- [NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining ☆118 · Updated last year
- ☆19 · Updated 4 years ago
- DEMix Layers for Modular Language Modeling ☆53 · Updated 3 years ago
- ☆41 · Updated 3 years ago
- Sequence-Level Mixed Sample Data Augmentation ☆21 · Updated 3 years ago
- Code for the paper "Are Sixteen Heads Really Better than One?" ☆171 · Updated 4 years ago
- ☆66 · Updated 3 years ago
- Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data ☆56 · Updated 3 years ago
- Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization (ACL 2021) ☆17 · Updated 3 years ago
- ☆21 · Updated 5 years ago
- Code for the paper "Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers" ☆17 · Updated 4 years ago
- No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022) ☆30 · Updated 3 years ago
- Code for the ACL 2022 paper "StableMoE: Stable Routing Strategy for Mixture of Experts" ☆45 · Updated 2 years ago
- Domain adaptation in NLP ☆52 · Updated 3 years ago
- Code associated with the ACL 2022 paper "SkipBERT: Efficient Inference with Shallow Layer Skipping" ☆16 · Updated 2 years ago
- A pre-trained model with a multi-exit transformer architecture ☆55 · Updated 2 years ago
- Code associated with the paper "Data Augmentation using Pre-trained Transformer Models" ☆52 · Updated last year
- Code for the ACL 2023 paper "Lifting the Curse of Capacity Gap in Distilling Language Models" ☆28 · Updated last year
- Code and datasets for the EMNLP 2020 paper "Calibration of Pre-trained Transformers" ☆57 · Updated last year
- Code for the paper "How many data points is a prompt worth?" ☆48 · Updated 3 years ago