matthew-cavener / my-bert-is-too-big
Doing Knowledge Distillation on BERT because the inference time is too damn high!
☆9 · Updated 5 years ago
Alternatives and similar repositories for my-bert-is-too-big:
Users who are interested in my-bert-is-too-big are comparing it to the libraries listed below.
- NoiseMix - data generation for natural language ☆40 · Updated 6 years ago
- Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding (AAAI 2020) - PyTorch Implementation ☆31 · Updated last year
- Code for our ACL '20 paper "Representation Engineering with Natural Language Explanations" ☆29 · Updated 4 years ago
- ☆22 · Updated 3 years ago
- The accompanying code for "Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understandin… ☆21 · Updated 5 years ago
- ☆20 · Updated 5 years ago
- ☆42 · Updated 5 years ago
- Boolean question answering with multi-task learning, using large LM embeddings such as BERT and RoBERTa ☆18 · Updated 5 years ago
- Pre-training character n-gram embeddings ☆22 · Updated last year
- ☆42 · Updated 4 years ago
- pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference ☆62 · Updated 2 years ago
- Code for bidirectional sequence generation (BiSon) for generating from BERT pre-trained models. ☆51 · Updated 5 years ago
- Uncovering divergent linguistic information in word embeddings with lessons for intrinsic and extrinsic evaluation ☆63 · Updated 6 years ago
- Code and datasets of "Multilingual Extractive Reading Comprehension by Runtime Machine Translation" ☆40 · Updated 6 years ago
- Stacked Denoising BERT for Noisy Text Classification (Neural Networks 2020) ☆32 · Updated 2 years ago
- TextGraphs-13 Shared Task on Multi-Hop Inference Explanation Regeneration ☆44 · Updated 5 years ago
- Evaluation suite for testing automatic grammatical error corrections ☆38 · Updated 7 years ago
- Hyperparameter search for AllenNLP - powered by Ray Tune ☆28 · Updated 2 weeks ago
- Tools for training PyTorch language models ☆27 · Updated 4 years ago
- Source code of bison ☆26 · Updated 4 years ago
- A framework for training and evaluating AI models on a variety of openly available dialogue datasets. ☆36 · Updated 4 years ago
- ☆32 · Updated 5 years ago
- ☆47 · Updated 4 years ago
- A novel method of constrained decoding for neural NLG (NNLG) models ☆83 · Updated 4 years ago
- Backtranslations of IMDB movie reviews for data augmentation purposes ☆11 · Updated 5 years ago
- Companion site for "Analysis Methods in Neural Language Processing: A Survey" ☆66 · Updated 5 years ago
- Code for SIGDial 2019 Best Paper: Structured Fusion Networks for Dialog https://arxiv.org/abs/1907.10016 ☆31 · Updated 5 years ago
- The implementation of "Neural Machine Translation without Embeddings", NAACL 2021 ☆33 · Updated 3 years ago
- Code for papers "A Surprisingly Robust Trick for Winograd Schema Challenge" and "WikiCREM: A Large Unsupervised Corpus for Coreference Re… ☆71 · Updated 2 years ago
- Code for the CIKM 2019 Paper: How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations ☆31 · Updated last year