pvgladkov / knowledge-distillation
PyTorch implementations of algorithms for knowledge distillation.
☆57 Updated 4 years ago
Alternatives and similar repositories for knowledge-distillation:
Users who are interested in knowledge-distillation are comparing it to the repositories listed below.
- Knowledge Distillation from BERT ☆52 Updated 6 years ago
- CIKM 2020: Speaker-Aware BERT for Multi-Turn Response Selection in Retrieval-Based Chatbots ☆74 Updated 4 years ago
- SpanNER: Named Entity Re-/Recognition as Span Prediction ☆124 Updated 2 years ago
- ☆42 Updated 4 years ago
- For the code release of our arXiv paper "Revisiting Few-sample BERT Fine-tuning" (https://arxiv.org/abs/2006.05987). ☆184 Updated last year
- Pytorch-version BERT-flow: One can apply BERT-flow to any PLM within Pytorch framework. ☆72 Updated 3 years ago
- Code associated with the "Data Augmentation using Pre-trained Transformer Models" paper ☆132 Updated last year
- Selections from EMNLP 2020 ☆60 Updated 3 years ago
- ☆25 Updated 4 years ago
- ☆50 Updated last year
- Source code for our "TitleStylist" paper at ACL 2020 ☆76 Updated 6 months ago
- ☆66 Updated 2 years ago
- A PyTorch implementation of "Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation" ☆56 Updated 4 years ago
- Implementation of Self-adjusting Dice Loss from "Dice Loss for Data-imbalanced NLP Tasks" paper ☆107 Updated 4 years ago
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ☆91 Updated 3 years ago
- Library of various Few-Shot Learning frameworks for text classification ☆60 Updated 2 years ago
- The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach … ☆62 Updated 4 years ago
- Research code for ACL 2020 paper: "Distilling Knowledge Learned in BERT for Text Generation". ☆131 Updated 3 years ago
- Few-shot binary text classification with Induction Networks and Word2Vec weights initialization ☆108 Updated 6 months ago
- Code for the paper Non-Autoregressive Dialog State Tracking (ICLR20) ☆45 Updated 4 years ago
- ☆66 Updated 3 years ago
- ☆41 Updated 3 years ago
- Pre-processing and in some cases downloading of datasets for the paper "Content Selection in Deep Learning Models of Summarization." ☆78 Updated 2 years ago
- Code for the RecAdam paper: Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting. ☆115 Updated 4 years ago
- Intent Detection and Slot Filling ☆37 Updated last year
- EMNLP'19: Bridging the Gap between Relevance Matching and Semantic Matching for Short Text Similarity Modeling ☆77 Updated last year
- Code for the paper "Efficient Adaption of Pretrained Transformers for Abstractive Summarization" ☆71 Updated 5 years ago
- AAAI-2021 paper: Unsupervised Summarization for Chat Logs with Topic-Oriented Ranking and Context-Aware Auto-Encoders. ☆38 Updated 3 years ago
- Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling (ACL-2020) ☆77 Updated 4 years ago
- ☆28 Updated 5 years ago