Edy-Barraza / Transformer_Distillation
Knowledge Distillation For Transformer Language Models
☆52Updated last year
Alternatives and similar repositories for Transformer_Distillation:
Users who are interested in Transformer_Distillation are comparing it to the libraries listed below.
- dstc7-noesis☆46Updated 5 years ago
- Record papers for some NLP related area☆24Updated 2 years ago
- ☆46Updated 6 months ago
- Distilling BERT using natural language generation.☆36Updated last year
- ☆59Updated 5 years ago
- Multiple Different Natural Language Processing Tasks in a Single Deep Model☆48Updated 6 years ago
- MAsked Sequence to Sequence (MASS) pre-training for language generation☆21Updated 5 years ago
- TensorFlow implementation of Bi-directional RNN Language Model☆38Updated 6 years ago
- Source code for the paper "Multilingual Neural Machine Translation with Soft Decoupled Encoding"☆29Updated 3 years ago
- Natural Language Generation by Hierarchical Decoding with Linguistic Patterns (NAACL-HLT 2018), Investigating Linguistic Pattern Ordering…☆32Updated 6 years ago
- PyTorch port of BERT ML model☆16Updated 6 years ago
- ICLR2019, Multilingual Neural Machine Translation with Knowledge Distillation☆70Updated 4 years ago
- Source code for "Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation"☆18Updated 5 years ago
- ☆47Updated 4 years ago
- ☆83Updated 4 years ago
- This is the code in <Selection Bias Explorations and Debias Methods for Natural Language Sentence Matching Datasets> which has been accep…☆34Updated last year
- This is the code for "Learning Sentiment Memories for Sentiment Modification without Parallel Data".☆54Updated 6 years ago
- This is the PyTorch implementation of the ACL 2019 paper RankQA: Neural Question Answering with Answer Re-Ranking.☆83Updated 3 years ago
- Re-rank n-best lists using additional features.☆28Updated 6 years ago
- Boolean Question Answering with multi-task learning and uses large LM embeddings like BERT, RoBERTa☆18Updated 5 years ago
- BERT Extension in TensorFlow☆30Updated 5 years ago
- An implementation of "Two are Better than One: An Ensemble of Retrieval- and Generation-Based Dialog Systems"☆14Updated 5 years ago
- Ordered Neurons LSTM☆30Updated 3 years ago
- NAACL'19: "Jointly Optimizing Diversity and Relevance in Neural Response Generation"☆74Updated 4 years ago
- Code for the paper "A Theoretical Analysis of the Repetition Problem in Text Generation" in AAAI 2021.☆51Updated 2 years ago
- ☆33Updated 5 years ago
- Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation☆22Updated 4 years ago
- ☆24Updated 4 years ago
- Code for EMNLP 2018 paper https://arxiv.org/pdf/1808.09075.pdf☆38Updated 6 years ago
- A parser of the Multi-Domain Wizard-of-Oz dataset (MultiWOZ)☆67Updated 6 years ago