robeld / ERNIE

Open Source Neural Machine Translation in PyTorch

☆17

Alternatives and similar repositories for ERNIE

Users who are interested in ERNIE compare it to the libraries listed below.

castorini / DeeBERT
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
☆157 Updated 3 years ago
princeton-nlp / CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
☆196 Updated 2 years ago
huggingface / nn_pruning
Prune a model while finetuning or training.
☆403 Updated 3 years ago
mit-han-lab / hardware-aware-transformers
[ACL'20] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
☆335 Updated last year
kssteven418 / I-BERT
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
☆252 Updated 2 years ago
kssteven418 / LTP
[KDD'22] Learned Token Pruning for Transformers
☆98 Updated 2 years ago
huggingface / block_movement_pruning
Block Sparse movement pruning
☆81 Updated 4 years ago
Andrew-Tierno / QuantizedTransformer
Implementation of a Quantized Transformer Model
☆19 Updated 6 years ago
pmichel31415 / are-16-heads-really-better-than-1
Code for the paper "Are Sixteen Heads Really Better than One?"
☆172 Updated 5 years ago
lonePatient / MobileBert_PyTorch
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
☆68 Updated 5 years ago
intersun / PKD-for-BERT-Model-Compression
PyTorch implementation of Patient Knowledge Distillation for BERT Model Compression
☆203 Updated 5 years ago
WoosukKwon / retraining-free-pruning
[NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers
☆190 Updated 2 years ago
khakhulin / compressed-transformer
Compression of NMT transformer model with tensor methods
☆48 Updated 6 years ago
IBM / PoWER-BERT
Method to improve inference time for BERT. This is an implementation of the paper titled "PoWER-BERT: Accelerating BERT Inference via Pro…
☆61 Updated 2 months ago
IntelLabs / Model-Compression-Research-Package
A library for researching neural networks compression and acceleration methods.
☆139 Updated 10 months ago
bzhangGo / rmsnorm
Root Mean Square Layer Normalization
☆245 Updated 2 years ago
JetRunner / BERT-of-Theseus
⛵️ The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).
☆313 Updated 2 years ago
tnq177 / transformers_without_tears
Transformers without Tears: Improving the Normalization of Self-Attention
☆132 Updated last year
ymcui / LAMB_Optimizer_TF
LAMB Optimizer for Large Batch Training (TensorFlow version)
☆120 Updated 5 years ago
bytedance / effective_transformer
Running BERT without Padding
☆472 Updated 3 years ago
clovaai / length-adaptive-transformer
Official PyTorch Implementation of Length-Adaptive Transformer (ACL 2021)
☆101 Updated 4 years ago
qsyao / cudaBERT
A Fast Multi-processing BERT-Inference System
☆101 Updated 2 years ago
mit-han-lab / lite-transformer
[ICLR 2020] Lite Transformer with Long-Short Range Attention
☆612 Updated last year
RayeRen / multilingual-kd-pytorch
[ICLR 2019] Multilingual Neural Machine Translation with Knowledge Distillation
☆70 Updated 4 years ago
Qualcomm-AI-research / transformer-quantization
☆206 Updated 3 years ago
bytedance / ParaGen
ParaGen is a PyTorch deep learning framework for parallel sequence generation.
☆186 Updated 2 years ago
yitu-opensource / ConvBert
☆251 Updated 2 years ago
mitchellgordon95 / bert-prune
☆17 Updated 5 years ago
lonePatient / electra_pytorch
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
☆91 Updated 3 years ago
VITA-Group / BERT-Tickets
[NeurIPS 2020] "The Lottery Ticket Hypothesis for Pre-trained BERT Networks", Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Ya…
☆140 Updated 3 years ago