PyTorch implementation of Patient Knowledge Distillation for BERT Model Compression
☆203 · Sep 20, 2019 · Updated 6 years ago
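For context on what the repo above implements: Patient Knowledge Distillation trains a small student BERT to match not only the teacher's output distribution but also its intermediate-layer [CLS] representations. Below is a minimal sketch of such a combined loss in PyTorch. The function name, the coefficients `alpha`/`beta`/`temperature`, and the layer-matching scheme are illustrative assumptions, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def pkd_loss(student_logits, teacher_logits, labels,
             student_hidden, teacher_hidden,
             temperature=4.0, alpha=0.5, beta=10.0):
    """Sketch of a Patient Knowledge Distillation objective.

    student_hidden / teacher_hidden: lists of [CLS] hidden states, one
    tensor per matched layer pair. The "patient" part is that the student
    mimics these intermediate layers, not just the final logits.
    Hyperparameter values here are illustrative assumptions.
    """
    # Standard cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    # Soft-label distillation: KL divergence between temperature-scaled
    # teacher and student distributions, rescaled by T^2.
    t = temperature
    kd = F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

    # Patient loss: mean-squared error between L2-normalized [CLS]
    # states of each matched intermediate layer pair.
    pt = sum(
        F.mse_loss(F.normalize(s, dim=-1), F.normalize(h, dim=-1))
        for s, h in zip(student_hidden, teacher_hidden)
    ) / len(student_hidden)

    return (1 - alpha) * ce + alpha * kd + beta * pt
```

A quick usage example with random tensors standing in for real model outputs:

```python
torch.manual_seed(0)
student_logits = torch.randn(8, 3)
teacher_logits = torch.randn(8, 3)
labels = torch.randint(0, 3, (8,))
student_hidden = [torch.randn(8, 16) for _ in range(3)]
teacher_hidden = [torch.randn(8, 16) for _ in range(3)]
loss = pkd_loss(student_logits, teacher_logits, labels,
                student_hidden, teacher_hidden)
```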
Alternatives and similar repositories for PKD-for-BERT-Model-Compression
Users that are interested in PKD-for-BERT-Model-Compression are comparing it to the libraries listed below
- ☆15 · Sep 10, 2019 · Updated 6 years ago
- BERT distillation (distillation experiments based on BERT) ☆314 · Jul 30, 2020 · Updated 5 years ago
- Code for the EMNLP 2020 paper CoDIR ☆41 · Oct 4, 2022 · Updated 3 years ago
- ⛵️The official PyTorch implementation of "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020). ☆315 · Jun 12, 2023 · Updated 2 years ago
- Pretrained language models and related optimization techniques developed by Huawei Noah's Ark Lab. ☆3,156 · Jan 22, 2024 · Updated 2 years ago
- A PyTorch-based knowledge distillation toolkit for natural language processing ☆1,696 · May 8, 2023 · Updated 2 years ago
- ☆61 · Nov 14, 2019 · Updated 6 years ago
- The source code of FastBERT (ACL 2020) ☆609 · Oct 29, 2021 · Updated 4 years ago
- ☆17 · May 14, 2020 · Updated 5 years ago
- Awesome Knowledge Distillation ☆3,820 · Dec 25, 2025 · Updated 2 months ago
- Multi-Task Deep Neural Networks for Natural Language Understanding ☆2,258 · Mar 7, 2024 · Updated last year
- Knowledge Distillation from BERT ☆54 · Jan 7, 2019 · Updated 7 years ago
- ALBERT: A Lite BERT for Self-Supervised Learning of Language Representations, with large-scale Chinese pre-trained ALBERT models ☆3,984 · Nov 21, 2022 · Updated 3 years ago
- Code for the paper "Are Sixteen Heads Really Better than One?" ☆175 · Apr 1, 2020 · Updated 5 years ago
- AIR retriever for multi-hop QA (ACL 2020 paper) ☆30 · Jul 18, 2020 · Updated 5 years ago
- Implementation of NATv2 ☆23 · Feb 20, 2021 · Updated 5 years ago
- Adversarial Training for Natural Language Understanding ☆253 · Sep 6, 2023 · Updated 2 years ago
- LGEB: a benchmark for language generation evaluation ☆16 · Oct 21, 2022 · Updated 3 years ago
- A Multi-Type Multi-Span Network for reading comprehension that requires discrete reasoning ☆89 · Nov 19, 2019 · Updated 6 years ago
- Research code for the ACL 2020 paper "Distilling Knowledge Learned in BERT for Text Generation" ☆129 · Jun 30, 2021 · Updated 4 years ago
- Implementation of the ACL 2019 paper "Matching the Blanks: Distributional Similarity for Relation Learning" ☆153 · Dec 8, 2022 · Updated 3 years ago
- DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference ☆162 · Mar 25, 2022 · Updated 3 years ago
- Code for the AAAI 2022 paper "Unsupervised Sentence Representation via Contrastive Learning with Mixing Negatives" ☆23 · Jun 14, 2022 · Updated 3 years ago
- Data augmentation for NLP ☆294 · Dec 10, 2020 · Updated 5 years ago
- Joint Slot Filling and Intent Detection via Capsule Neural Networks (ACL 2019) https://arxiv.org/abs/1812.09471 ☆139 · Mar 24, 2023 · Updated 2 years ago
- Source code of K-BERT (AAAI 2020) ☆984 · Jan 27, 2023 · Updated 3 years ago
- BERT for Multitask Learning ☆543 · Apr 12, 2023 · Updated 2 years ago
- Awesome Knowledge-Distillation: knowledge distillation papers (2014–2021), organized by category ☆2,654 · May 30, 2023 · Updated 2 years ago
- A Lite BERT for self-supervised learning of language representations ☆714 · May 13, 2020 · Updated 5 years ago
- RoBERTa for Chinese: Chinese pre-trained RoBERTa models ☆2,774 · Jul 22, 2024 · Updated last year
- Revisiting Knowledge Distillation via Label Smoothing Regularization (CVPR 2020 oral) ☆585 · Feb 15, 2023 · Updated 3 years ago
- BERT with History Answer Embedding for conversational question answering ☆113 · May 10, 2021 · Updated 4 years ago
- Pre-trained Chinese XLNet models ☆1,650 · Jul 15, 2025 · Updated 7 months ago
- A BERT-based Chinese text encoder enhanced by n-gram representations ☆647 · Jul 24, 2022 · Updated 3 years ago
- Scripts to train a bidirectional LSTM with knowledge distillation from BERT ☆159 · Nov 21, 2019 · Updated 6 years ago
- Reading comprehension based on the pre-trained BERT model ☆96 · Nov 18, 2025 · Updated 3 months ago
- [ACL 2020] DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering ☆121 · May 22, 2023 · Updated 2 years ago
- Code for using and evaluating SpanBERT ☆904 · Jul 25, 2023 · Updated 2 years ago
- Question Answering with Interactive Text (QAit), code for the EMNLP 2019 paper "Interactive Language Learning by Question Answering" ☆44 · Sep 3, 2019 · Updated 6 years ago