lonePatient / MobileBert_PyTorch
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
☆65 Updated 4 years ago
Alternatives and similar repositories for MobileBert_PyTorch:
Users interested in MobileBert_PyTorch are comparing it to the libraries listed below:
- PyTorch implementation of Patient Knowledge Distillation for BERT Model Compression ☆200 Updated 5 years ago
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ☆91 Updated 3 years ago
- [ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408 ☆192 Updated last year
- Simple experiments with the R-Drop method on Chinese tasks ☆90 Updated 2 years ago
- Reproduction of the ACL 2020 FastBERT paper; paper at //arxiv.org/pdf/2004.02178.pdf ☆192 Updated 3 years ago
- ⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020). ☆310 Updated last year
- ☆251 Updated 2 years ago
- A PyTorch implementation of "Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation" ☆56 Updated 4 years ago
- High-performance small-model evaluation: Shared Tasks in NLPCC 2020, Task 1 - Light Pre-Training Chinese Language Model for NLP Task ☆57 Updated 4 years ago
- Knowledge Distillation from BERT ☆51 Updated 6 years ago
- Source code for the NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference" ☆44 Updated 2 years ago
- Finetune CPM-1 ☆75 Updated last year
- Lightweight deep learning inference service framework ☆40 Updated 3 years ago
- TensorFlow version of BERT-of-Theseus ☆62 Updated 4 years ago
- Notes on problems encountered when applying seq2seq (S2S) models in practice, and how to solve them. ☆27 Updated 4 years ago
- ☆78 Updated 2 years ago
- A PyTorch implementation of the Transformer from "Attention Is All You Need" ☆103 Updated 4 years ago
- Offline, on-device reading comprehension: QA for mobile, Android & iPhone ☆60 Updated 2 years ago
- Chinese version of reformer-pytorch: a simple, efficient generative model with GPT-2-like results ☆16 Updated last year
- Method to improve inference time for BERT. This is an implementation of the paper titled "PoWER-BERT: Accelerating BERT Inference via Pro… ☆59 Updated last year
- PyTorch implementations of algorithms for knowledge distillation. ☆57 Updated 4 years ago
- Code for the paper "BERT Loses Patience: Fast and Robust Inference with Early Exit". ☆64 Updated 3 years ago
- ICLR 2019: Multilingual Neural Machine Translation with Knowledge Distillation ☆70 Updated 4 years ago
- Code release for the arXiv paper "Revisiting Few-sample BERT Fine-tuning" (https://arxiv.org/abs/2006.05987). ☆184 Updated last year
- ☆50 Updated last year
- Code for the paper "Are Sixteen Heads Really Better than One?" ☆171 Updated 4 years ago
- Adversarial Training for NLP in Keras ☆46 Updated 4 years ago
- ☆86 Updated 4 years ago
- 10th place solution on the TestB leaderboard, BLEU 32.1 ☆63 Updated 5 years ago
- DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference ☆153 Updated 2 years ago
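Many of the distillation repositories above (Patient KD, MobileBert_PyTorch, the multilingual NMT distillation work) share the same core objective: train a small student to match the teacher's temperature-softened output distribution. A minimal plain-Python sketch of that soft-label loss, not taken from any of the listed repos, with the temperature value and function names chosen for illustration:

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    Minimizing this pushes the student's outputs toward the teacher's; in
    practice it is usually mixed with the ordinary cross-entropy on gold
    labels via a weighting hyperparameter.
    """
    p = softmax(teacher_logits, temperature)  # teacher = target distribution
    q = softmax(student_logits, temperature)  # student = learned distribution
    # KL(p || q), scaled by T^2 to keep gradient magnitudes comparable
    # across temperatures (as in Hinton et al.'s distillation formulation)
    return temperature ** 2 * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Identical logits give zero loss; diverging logits give a positive loss.
print(distillation_loss([1.0, 2.0], [1.0, 2.0]))      # 0.0
print(distillation_loss([1.0, 2.0], [2.0, 1.0]) > 0)  # True
```

The listed repos differ mainly in what else they match besides the final distribution: Patient KD adds intermediate-layer losses, while early-exit methods (DeeBERT, FastBERT) change inference rather than the training loss.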