castorini / berxit

☆22

Alternatives and similar repositories for berxit

Users who are interested in berxit are comparing it to the libraries listed below.

IBM / PoWER-BERT
Method to improve inference time for BERT. This is an implementation of the paper titled "PoWER-BERT: Accelerating BERT Inference via Pro…
☆61 Updated 2 months ago
JetRunner / PABEE
Code for the paper "BERT Loses Patience: Fast and Robust Inference with Early Exit".
☆65 Updated 4 years ago
huggingface / block_movement_pruning
Block Sparse movement pruning
☆81 Updated 4 years ago
jungokasai / deep-shallow
☆44 Updated 4 years ago
clovaai / length-adaptive-transformer
Official PyTorch Implementation of Length-Adaptive Transformer (ACL 2021)
☆101 Updated 4 years ago
microsoft / Stochastic-Mixture-of-Experts
This package implements THOR: Transformer with Stochastic Experts.
☆65 Updated 3 years ago
vklabmipt / implicit-unlikelihood-training
Improving Neural Text Generation with Reinforcement Learning
☆22 Updated 4 years ago
intersun / CoDIR
Code for EMNLP 2020 paper CoDIR
☆41 Updated 2 years ago
kernelmachine / demix
DEMix Layers for Modular Language Modeling
☆53 Updated 3 years ago
fuzihaofzh / repetition-problem-nlg
Code for the paper "A Theoretical Analysis of the Repetition Problem in Text Generation" in AAAI 2021.
☆54 Updated 2 years ago
microsoft / SEED-Encoder
☆45 Updated 3 years ago
haorannlp / mix
Code for "Mixed Cross Entropy Loss for Neural Machine Translation"
☆20 Updated 3 years ago
frankxu2004 / knnlm-why
Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"
☆58 Updated 2 years ago
allenai / staged-training
Staged Training for Transformer Language Models
☆32 Updated 3 years ago
thunlp / Knowledge-Inheritance
Source code for paper: Knowledge Inheritance for Pre-trained Language Models
☆38 Updated 3 years ago
microsoft / AMOS
[ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators
☆24 Updated last year
thunlp / TR-BERT
Source code for NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference"
☆47 Updated 3 years ago
amazon-science / dq-bart
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization (ACL 2022)
☆50 Updated 2 years ago
romebert / RomeBERT
☆16 Updated 4 years ago
RUCAIBox / ELMER
This repository is the official implementation of our EMNLP 2022 paper ELMER: A Non-Autoregressive Pre-trained Language Model for Efficie…
☆26 Updated 2 years ago
clovaai / pkm-transformers
Official implementation of PKM-augmented language models (Findings of EMNLP 2020)
☆10 Updated 4 years ago
tanyuqian / ctc-gen-eval
EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation
☆97 Updated 2 years ago
jungokasai / twist_decoding
☆29 Updated 3 years ago
allenai / data-efficient-finetuning
Code for paper 'Data-Efficient FineTuning'
☆29 Updated 2 years ago
jxhe / efficient-knnlm
PyTorch implementation of paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021)
☆73 Updated 3 years ago
castorini / DeeBERT
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
☆157 Updated 3 years ago
ictnlp / TLAT-NMT
Source code for the EMNLP 2020 long paper "Token-level Adaptive Training for Neural Machine Translation".
☆20 Updated 2 years ago
facebookresearch / ELECTRA-Fewshot-Learning
This repository contains the code for the paper "Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models".
☆48 Updated 3 years ago
cliang1453 / SAGE
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022)
☆30 Updated 3 years ago
sunyt32 / torchscale
Transformers at any scale
☆41 Updated last year