tacchinotacchi / distil-bilstm
Scripts to train a bidirectional LSTM with knowledge distillation from BERT
☆158 · Updated 5 years ago
Alternatives and similar repositories for distil-bilstm:
Users who are interested in distil-bilstm are comparing it to the repositories listed below:
- Fork of huggingface/pytorch-pretrained-BERT for BERT on STILTs ☆106 · Updated 2 years ago
- XLNet: fine tuning on RTX 2080 GPU - 8 GB ☆154 · Updated 5 years ago
- ☆58 · Updated 5 years ago
- Comparing Text Classification results using BERT embedding and ULMFiT embedding ☆65 · Updated 5 years ago
- Implementation of ULMFiT algorithm for text classification via transfer learning ☆95 · Updated 5 years ago
- LM, ULMFiT et al. ☆47 · Updated 5 years ago
- PyTorch Implementation of ALBERT (A Lite BERT for Self-supervised Learning of Language Representations) ☆226 · Updated 3 years ago
- XLNet for generating language. ☆165 · Updated 3 years ago
- BertQA - Attention on Steroids ☆115 · Updated 2 years ago
- An Attention Layer in Keras ☆43 · Updated 5 years ago
- Re-implementation of ELMo on Keras ☆135 · Updated last year
- Variational Methods for Pretraining in Resource-limited Environments ☆174 · Updated 4 years ago
- Bidirectional Attention Flow for Machine Comprehension implemented in Keras 2 ☆64 · Updated 2 years ago
- Transformer-XL with checkpoint loader ☆68 · Updated 2 years ago
- Python package for understanding the difficulty of text classification datasets. (in CoNLL 2018) ☆63 · Updated 3 years ago
- Fine Tuning Language Models for Multilabel Prediction ☆61 · Updated 2 years ago
- Efficient Transformers for research, PyTorch and TensorFlow using Locality Sensitive Hashing ☆93 · Updated 4 years ago
- A set of tutorials for torchtext ☆186 · Updated 5 years ago
- Reproducing "Character-Level Language Modeling with Deeper Self-Attention" in PyTorch ☆61 · Updated 6 years ago
- Implementation of the LAMB optimizer for Keras from the paper "Reducing BERT Pre-Training Time from 3 Days to 76 Minutes" ☆76 · Updated 5 years ago
- Beam search for neural network sequence to sequence (encoder-decoder) models. ☆34 · Updated 5 years ago
- State of the Art results in Intent Classification using Semantic Hashing for three datasets: AskUbuntu, Chatbot and WebApplication. ☆134 · Updated 4 years ago
- Sentence Embeddings in NLI with Iterative Refinement Encoders ☆79 · Updated 2 years ago
- Multilingual hierarchical attention networks toolkit ☆78 · Updated 5 years ago
- Text Generation Using A Variational Autoencoder ☆111 · Updated 7 years ago
- Sequence to Sequence Models in PyTorch ☆44 · Updated 5 months ago
- Fine-tune BERT to generate sentence embeddings for cosine similarity ☆70 · Updated 5 years ago
- TensorFlow implementation of Semi-supervised Sequence Learning (https://arxiv.org/abs/1511.01432) ☆82 · Updated 2 years ago