tacchinotacchi / distil-bilstm
Scripts to train a bidirectional LSTM with knowledge distillation from BERT
☆158 Updated 5 years ago
Alternatives and similar repositories for distil-bilstm:
Users who are interested in distil-bilstm compare it to the repositories listed below.
- XLNet: fine tuning on RTX 2080 GPU - 8 GB ☆154 Updated 5 years ago
- Fork of huggingface/pytorch-pretrained-BERT for BERT on STILTs ☆107 Updated 2 years ago
- ☆58 Updated 5 years ago
- Pytorch Implementation of ALBERT (A Lite BERT for Self-supervised Learning of Language Representations) ☆226 Updated 4 years ago
- We summarize the summarization papers presented at major conferences (starting with ACL 2019) ☆85 Updated 5 years ago
- Implementation of ULMFit algorithm for text classification via transfer learning ☆94 Updated 6 years ago
- LM, ULMFit et al. ☆46 Updated 5 years ago
- Authors' implementation of EMNLP-IJCNLP 2019 paper "Answering Complex Open-domain Questions Through Iterative Query Generation" ☆195 Updated 5 years ago
- XLNet for generating language. ☆165 Updated 4 years ago
- Reproducing Character-Level-Language-Modeling with Deeper Self-Attention in PyTorch ☆61 Updated 6 years ago
- A PyTorch implementation of a Bi-LSTM CRF with character-level features ☆63 Updated 6 years ago
- LM Pretraining with PyTorch/TPU ☆134 Updated 5 years ago
- PyTorch DataLoader for seq2seq ☆85 Updated 6 years ago
- Implementation of the LAMB optimizer for Keras from the paper "Reducing BERT Pre-Training Time from 3 Days to 76 Minutes" ☆75 Updated 6 years ago
- The Annotated Encoder Decoder with Attention ☆166 Updated 4 years ago
- A toolkit for evaluating the linguistic knowledge and transferability of contextual representations. Code for "Linguistic Knowledge and T… ☆210 Updated 3 years ago
- Easy to use NLP library built on PyTorch and TorchText ☆255 Updated 5 years ago
- Pytorch and Torchtext implementation of Sequence to sequence ☆59 Updated 7 years ago
- A set of tutorials for torchtext ☆186 Updated 6 years ago
- Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning ☆311 Updated 4 years ago
- Variational Methods for Pretraining in Resource-limited Environments ☆174 Updated 4 years ago
- Neat (Neural Attention) Vision, is a visualization tool for the attention mechanisms of deep-learning models for Natural Language Process… ☆250 Updated 6 years ago
- Efficient Transformers for research, PyTorch and Tensorflow using Locality Sensitive Hashing ☆94 Updated 5 years ago
- Pre-training of Language Models for Language Understanding ☆83 Updated 5 years ago
- Comparing Text Classification results using BERT embedding and ULMFIT embedding ☆65 Updated 6 years ago
- Fine Tuning Language Models for Multilabel Prediction ☆61 Updated 2 years ago
- Re-implementation of ELMo on Keras ☆134 Updated 2 years ago
- Exploring Random Encoders for Sentence Classification ☆183 Updated 5 years ago
- Tensorflow implementation of Semi-supervised Sequence Learning (https://arxiv.org/abs/1511.01432) ☆81 Updated 2 years ago
- Dynamic Meta-Embeddings for Improved Sentence Representations ☆331 Updated 4 years ago