robeld / ERNIE
Open Source Neural Machine Translation in PyTorch
☆17 · Updated 5 years ago
Alternatives and similar repositories for ERNIE:
Users interested in ERNIE are comparing it to the repositories listed below.
- Implementation of a Quantized Transformer Model ☆18 · Updated 5 years ago
- DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference ☆154 · Updated 2 years ago
- Block Sparse movement pruning ☆78 · Updated 4 years ago
- [KDD'22] Learned Token Pruning for Transformers ☆96 · Updated last year
- Compression of NMT transformer model with tensor methods ☆48 · Updated 5 years ago
- A method for improving BERT inference time; an implementation of the paper "PoWER-BERT: Accelerating BERT Inference via Pro…" ☆59 · Updated last year
- Code for the paper "Are Sixteen Heads Really Better than One?" ☆171 · Updated 4 years ago
- Official PyTorch implementation of Length-Adaptive Transformer (ACL 2021) ☆101 · Updated 4 years ago
- [ACL'20] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing ☆331 · Updated 7 months ago
- [ACL 2022] Structured Pruning Learns Compact and Accurate Models (https://arxiv.org/abs/2204.00408) ☆192 · Updated last year
- [ICML'21 Oral] I-BERT: Integer-only BERT Quantization ☆238 · Updated 2 years ago
- Source code for the IJCAI 2022 long paper "Parameter-Efficient Sparsity for Large Language Models Fine-Tuning" ☆13 · Updated 2 years ago
- [ICLR 2019] Multilingual Neural Machine Translation with Knowledge Distillation ☆70 · Updated 4 years ago
- A collection of Transformer guides, implementations, and variants ☆102 · Updated 5 years ago
- ☆13 · Updated 2 years ago
- ☆17 · Updated 4 years ago
- Code for the Findings of EMNLP 2021 paper "EfficientBERT: Progressively Searching Multilayer Perceptron …" ☆32 · Updated last year
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021) ☆60 · Updated 2 years ago
- ☆199 · Updated 3 years ago
- Implementation of the NeurIPS 2019 paper "Normalization Helps Training of Quantized LSTM" ☆30 · Updated 6 months ago
- PyTorch library for factorized L0-based pruning ☆44 · Updated last year
- Code for the ACL 2022 paper "SkipBERT: Efficient Inference with Shallow Layer Skipping" ☆16 · Updated 2 years ago
- ☆63 · Updated 4 years ago
- Prune a model while finetuning or training ☆399 · Updated 2 years ago
- Block-sparse primitives for PyTorch ☆153 · Updated 3 years ago
- Code for the paper "BERT Loses Patience: Fast and Robust Inference with Early Exit" ☆64 · Updated 3 years ago
- Parameter-Efficient Transfer Learning with Diff Pruning ☆73 · Updated 4 years ago
- [ICLR 2022] Code for the paper "Exploring Extreme Parameter Compression for Pre-trained Language Models" (https://arxiv.org/abs/2205.10036) ☆22 · Updated last year
- Official implementation of the ICLR 2022 paper "BiBERT: Accurate Fully Binarized BERT" ☆87 · Updated last year
- [NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers ☆181 · Updated last year