lonePatient / MobileBert_PyTorch
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
☆67Updated 5 years ago
Alternatives and similar repositories for MobileBert_PyTorch
Users who are interested in MobileBert_PyTorch are comparing it to the libraries listed below.
- pytorch implementation for Patient Knowledge Distillation for BERT Model Compression☆201Updated 5 years ago
- [ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408☆195Updated 2 years ago
- ⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).☆312Updated last year
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators☆91Updated 3 years ago
- Lightweight deep learning inference service framework☆39Updated 3 years ago
- A general framework for knowledge distillation☆54Updated 4 years ago
- Open Source Neural Machine Translation in PyTorch☆17Updated 6 years ago
- A PyTorch implementation of Transformer in "Attention is All You Need"☆106Updated 4 years ago
- Simple experiments with the R-Drop method on Chinese tasks☆91Updated 3 years ago
- Knowledge Distillation For Transformer Language Models☆52Updated last year
- Offline reading comprehension application: QA for mobile, Android & iPhone☆60Updated 2 years ago
- Implementation of RealFormer using pytorch☆100Updated 4 years ago
- ☆252Updated 2 years ago
- Code associated with the paper **SkipBERT: Efficient Inference with Shallow Layer Skipping**, at ACL 2022☆16Updated 2 years ago
- Reproduction of the ACL 2020 FastBERT paper; paper available at https://arxiv.org/pdf/2004.02178.pdf☆194Updated 3 years ago
- Code for the paper "Are Sixteen Heads Really Better than One?"☆172Updated 5 years ago
- For the code release of our arXiv paper "Revisiting Few-sample BERT Fine-tuning" (https://arxiv.org/abs/2006.05987).☆184Updated last year
- This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron …☆32Updated last year
- bert-of-theseus via bert4keras☆31Updated 4 years ago
- A simple, easy-to-use TinyBert: a pre-trained language model built by knowledge distillation from Bert☆265Updated 4 years ago
- adafactor optimizer for keras☆20Updated 3 years ago
- tensorflow version of bert-of-theseus☆62Updated 4 years ago
- Code for the paper "BERT Loses Patience: Fast and Robust Inference with Early Exit".☆65Updated 3 years ago
- Sharing some problems encountered with S2S in practical applications, and their solutions.☆27Updated 4 years ago
- An Albert Large QA model trained on the Baidu webqa and dureader datasets☆75Updated 5 years ago
- A PyTorch implementation of "Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation"☆56Updated 5 years ago
- Transformer model based on the Gated Attention Unit (early preview version)☆98Updated 2 years ago
- FLASHQuad_pytorch☆67Updated 3 years ago
- Chinese version of reformer-pytorch, a simple and efficient generative model with results similar to GPT2☆16Updated last year
- High-performance small model evaluation: Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task☆58Updated 5 years ago