microsoft / fastseq
An efficient implementation of popular sequence models for text generation, summarization, and translation tasks. Paper: https://arxiv.org/pdf/2106.04718.pdf
☆431 · Updated 2 years ago
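FastSeq is positioned as a drop-in accelerator for existing generation pipelines. Below is a minimal sketch of how it is typically wired in, assuming the one-line integration its README describes (importing `fastseq` before loading a Hugging Face or fairseq model); the model name and generation parameters are illustrative choices, not taken from this page.

```python
# Minimal sketch (assumption: fastseq applies its optimizations simply by being
# imported before transformers/fairseq, as its README describes).
import fastseq  # must come first so generation gets patched
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

# Illustrative model choice; any supported seq2seq model would work similarly.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")
model.eval()

article = "FastSeq provides efficient implementations of popular sequence models."
inputs = tokenizer([article], return_tensors="pt", truncation=True)

# Standard beam-search generation; fastseq aims to speed this call up
# without any API change on the caller's side.
with torch.no_grad():
    summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=60)

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```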
Related projects
Alternatives and complementary repositories for fastseq
- FastFormers - highly efficient transformer models for NLU ☆701 · Updated 10 months ago
- Repository containing code for "How to Train BERT with an Academic Budget" paper ☆309 · Updated last year
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c… ☆363 · Updated 2 years ago
- Fast BPE ☆656 · Updated 5 months ago
- XTREME is a benchmark for the evaluation of the cross-lingual generalization ability of pre-trained multilingual models that covers 40 ty… ☆631 · Updated last year
- Repository for the paper "Optimal Subarchitecture Extraction for BERT" ☆470 · Updated 2 years ago
- [ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.o… ☆605 · Updated 2 years ago
- Pretrain and finetune ELECTRA with fastai and huggingface. (Results of the paper replicated!) ☆325 · Updated 10 months ago
- MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf ☆288 · Updated 3 years ago
- ⛵️ The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020) ☆312 · Updated last year
- ⚡ Boost inference speed of T5 models by 5x & reduce the model size by 3x ☆565 · Updated last year
- ICML'2022: NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework ☆257 · Updated 10 months ago
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale ☆153 · Updated 11 months ago
- Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and pre-trained models for the paper "Parallel Iterative Edit Models … ☆228 · Updated last year
- Code to reproduce experiments in the paper "Task-Oriented Dialogue as Dataflow Synthesis" (TACL 2020) ☆308 · Updated 6 months ago
- Understanding the Difficulty of Training Transformers ☆328 · Updated 2 years ago
- Code associated with the "Don't Stop Pretraining" ACL 2020 paper ☆526 · Updated 3 years ago
- DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue ☆281 · Updated last year
- Interpretable Evaluation for AI Systems ☆361 · Updated last year
- A novel embedding training algorithm leveraging ANN search that achieves SOTA retrieval on TREC DL 2019 and OpenQA benchmarks ☆363 · Updated last year
- Code and data to support the paper "PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them" ☆202 · Updated 3 years ago
- Neural Text Generation with Unlikelihood Training ☆310 · Updated 3 years ago
- Interpretable Evaluation for (Almost) All NLP Tasks ☆193 · Updated 2 years ago
- A tool for holistic analysis of language generation systems ☆467 · Updated 2 years ago
- [EMNLP 2021] LM-Critic: Language Models for Unsupervised Grammatical Error Correction ☆119 · Updated 3 years ago
- Efficient, check-pointed data loading for deep learning with massive data sets ☆205 · Updated last year
- Code and data for the ACL 2020 paper "Few-Shot NLG with Pre-Trained Language Model" ☆189 · Updated last year