VProv / BPE-Dropout
An official implementation of the algorithm from "BPE-Dropout: Simple and Effective Subword Regularization".
☆50 · Updated 4 years ago
Alternatives and similar repositories for BPE-Dropout:
Users who are interested in BPE-Dropout are comparing it to the libraries listed below:
- A repository for experiments in quality-aware decoding ☆16 · Updated 2 years ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆30 · Updated 3 years ago
- ☆21 · Updated 2 years ago
- This repository hosts my experiments for the project I did with OffNote Labs. ☆10 · Updated 4 years ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆71 · Updated last year
- A library for minimum Bayes risk (MBR) decoding ☆37 · Updated last week
- ☆22 · Updated 4 years ago
- Code and data accompanying our ACL 2020 paper, "Unsupervised Domain Clusters in Pretrained Language Models". ☆58 · Updated 4 years ago
- Repository for Findings of EMNLP 2020 "Context-aware Stand-alone Neural Spelling Correction" ☆18 · Updated 4 years ago
- Implementation of Marge, Pre-training via Paraphrasing, in Pytorch ☆75 · Updated 4 years ago
- ☆41 · Updated 4 years ago
- ☆51 · Updated 4 years ago
- No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022) ☆30 · Updated 3 years ago
- ☆97 · Updated 2 years ago
- ☆92 · Updated last year
- Code for AAAI 2021 paper "Lexically Constrained Neural Machine Translation with Explicit Alignment Guidance" ☆25 · Updated 2 years ago
- [TMLR'23] Contrastive Search Is What You Need For Neural Text Generation ☆119 · Updated 2 years ago
- Official Implementation of "DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization." ☆139 · Updated 2 years ago
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer ☆55 · Updated 2 years ago
- Pytorch implementation of paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021) ☆72 · Updated 3 years ago
- PyTorch reimplementation of REALM and ORQA ☆22 · Updated 3 years ago
- EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation ☆96 · Updated 2 years ago
- ☆44 · Updated 3 years ago
- Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx… ☆136 · Updated last year
- ☆12 · Updated last year
- English-French MT dialogue dataset ☆17 · Updated 2 years ago
- Code for NeurIPS2020 "Incorporating BERT into Parallel Sequence Decoding with Adapters" ☆32 · Updated 2 years ago
- The implementation of "Neural Machine Translation without Embeddings", NAACL 2021 ☆33 · Updated 3 years ago
- This is a repository for the paper on testing inductive bias with scaled-down RoBERTa models. ☆20 · Updated 3 years ago
- ☆99 · Updated 2 years ago