PyTorch implementation of Patient Knowledge Distillation for BERT Model Compression
☆204Sep 20, 2019Updated 6 years ago
Alternatives and similar repositories for PKD-for-BERT-Model-Compression
Users that are interested in PKD-for-BERT-Model-Compression are comparing it to the libraries listed below.
- ☆15Sep 10, 2019Updated 6 years ago
- BERT distillation (BERT-based distillation experiments)☆314Jul 30, 2020Updated 5 years ago
- ⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).☆315Jun 12, 2023Updated 2 years ago
- Code for EMNLP 2020 paper CoDIR☆41Oct 4, 2022Updated 3 years ago
- Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.☆3,157Jan 22, 2024Updated 2 years ago
- A PyTorch-based knowledge distillation toolkit for natural language processing☆1,697May 8, 2023Updated 2 years ago
- The source code of FastBERT (ACL 2020)☆609Oct 29, 2021Updated 4 years ago
- ☆61Nov 14, 2019Updated 6 years ago
- Awesome Knowledge Distillation☆3,826Updated this week
- Knowledge Distillation from BERT☆54Jan 7, 2019Updated 7 years ago
- Multi-Task Deep Neural Networks for Natural Language Understanding☆2,257Mar 7, 2024Updated 2 years ago
- Code for the paper "Are Sixteen Heads Really Better than One?"☆175Apr 1, 2020Updated 5 years ago
- ☆17May 14, 2020Updated 5 years ago
- Adversarial Training for Natural Language Understanding☆253Sep 6, 2023Updated 2 years ago
- A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS, large-scale Chinese pre-trained ALBERT models☆3,983Nov 21, 2022Updated 3 years ago
- Data Augmentation for NLP☆294Dec 10, 2020Updated 5 years ago
- A PyTorch implementation of "Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation"☆56Mar 4, 2020Updated 6 years ago
- Research code for ACL 2020 paper: "Distilling Knowledge Learned in BERT for Text Generation".☆129Jun 30, 2021Updated 4 years ago
- LGEB: Benchmark of Language Generation Evaluation☆16Oct 21, 2022Updated 3 years ago
- Awesome Knowledge-Distillation. A categorized collection of knowledge distillation papers (2014-2021).☆2,657May 30, 2023Updated 2 years ago
- Multitask Learning for Machine Reading Comprehension, NAACL 2019☆102Sep 7, 2020Updated 5 years ago
- Dilate Gated Convolutional Neural Network For Machine Reading Comprehension☆39Aug 14, 2019Updated 6 years ago
- Unofficial PyTorch implementation of MiniLM and MiniLMv2☆23Jan 30, 2022Updated 4 years ago
- [NeurIPS'21] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang…☆89Dec 1, 2023Updated 2 years ago
- AIR retriever for Multi-Hop QA (ACL 2020 paper)☆30Jul 18, 2020Updated 5 years ago
- PyTorch implementations of algorithms for knowledge distillation.☆57Apr 24, 2020Updated 5 years ago
- Source code of K-BERT (AAAI2020)☆985Jan 27, 2023Updated 3 years ago
- Retrieve, Read, Rerank: Towards End-to-End Multi-Document Reading Comprehension☆104Sep 14, 2019Updated 6 years ago
- Scripts to train a bidirectional LSTM with knowledge distillation from BERT☆159Nov 21, 2019Updated 6 years ago
- [ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, …☆18Dec 30, 2021Updated 4 years ago
- Knowledge distillation in text classification with PyTorch. Knowledge distillation for Chinese text classification; teacher models BERT and XLNet, student model BiLSTM.☆229Jul 27, 2022Updated 3 years ago
- TensorFlow version of BERT-of-Theseus☆63Dec 11, 2020Updated 5 years ago
- A Lite BERT for Self-Supervised Learning of Language Representations☆714May 13, 2020Updated 5 years ago
- BERT for Multitask Learning☆544Apr 12, 2023Updated 2 years ago
- Pre-Trained Chinese XLNet☆1,648Jul 15, 2025Updated 8 months ago
- Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": UNITER…☆119Jan 13, 2021Updated 5 years ago
- ☆50Jun 12, 2023Updated 2 years ago
- A video retrieval dataset How2R and a video QA dataset How2QA☆24Oct 15, 2020Updated 5 years ago
- ☆44Jul 29, 2019Updated 6 years ago