An official implementation of the "BPE-Dropout: Simple and Effective Subword Regularization" algorithm.
☆53 Feb 17, 2021 Updated 5 years ago
Alternatives and similar repositories for BPE-Dropout
Users who are interested in BPE-Dropout are comparing it to the repositories listed below.
- ☆22 Oct 26, 2020 Updated 5 years ago
- A repository for experiments in quality-aware decoding ☆18 Jun 7, 2022 Updated 3 years ago
- Framework for neural-based Quality Estimation ☆41 Sep 23, 2020 Updated 5 years ago
- C++ implementation of HPYLM ☆11 May 2, 2017 Updated 8 years ago
- ☆13 Sep 25, 2024 Updated last year
- Code for Findings of ACL 2023 paper "Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency … ☆10 Jul 18, 2023 Updated 2 years ago
- Codes for the paper Hierarchical Contextualized Representation for Named Entity Recognition ☆51 Jan 1, 2020 Updated 6 years ago
- ☆29 Apr 15, 2023 Updated 2 years ago
- Code for NAACL 2022 main conference paper "Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation" ☆12 May 8, 2023 Updated 2 years ago
- A simple, Python-based, command-line runner for MGIZA++. ☆10 Mar 24, 2022 Updated 3 years ago
- ☆16 May 14, 2024 Updated last year
- System Combination ☆16 Aug 28, 2015 Updated 10 years ago
- ☆17 Apr 28, 2022 Updated 3 years ago
- ☆22 Jul 28, 2020 Updated 5 years ago
- Code for the CIKM 2019 Paper: How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations ☆32 Jun 12, 2023 Updated 2 years ago
- PhD thesis (updating) of Jiatao Gu from HKU ☆19 Aug 10, 2018 Updated 7 years ago
- A diff tool for language models ☆44 Dec 28, 2023 Updated 2 years ago
- [EMNLP 2021] LM-Critic: Language Models for Unsupervised Grammatical Error Correction ☆120 Sep 26, 2021 Updated 4 years ago
- Efficient, Extensible kNN-MT Framework ☆19 Sep 7, 2024 Updated last year
- Rough codebase for exploring initialization strategies for new word embeddings in pretrained LMs ☆19 Dec 10, 2021 Updated 4 years ago
- Japanese--Russian--English News Commentary Parallel Data ☆18 Jul 9, 2019 Updated 6 years ago
- A library for minimum Bayes risk (MBR) decoding ☆52 Nov 2, 2025 Updated 4 months ago
- Suite of 500 procedurally-generated NLP tasks to study language model adaptability ☆21 Jul 16, 2022 Updated 3 years ago
- Spoken Language Translation System ☆20 Jul 26, 2021 Updated 4 years ago
- A reference implementation of algorithms for distributions over spanning trees. ☆21 Mar 10, 2020 Updated 5 years ago
- A collection of all our phonemizers for dataset construction and inference ☆27 Feb 21, 2025 Updated last year
- Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model" ☆23 Jun 28, 2024 Updated last year
- Parallel corpora for the biomedical domain ☆50 Jul 18, 2024 Updated last year
- ☆93 Feb 13, 2024 Updated 2 years ago
- Unsupervised text tokenizer focused on computational efficiency ☆975 Mar 29, 2024 Updated last year
- Question Answering as Global Reasoning over Semantic Abstractions (AAAI-18) ☆33 Jun 7, 2018 Updated 7 years ago
- NAACL 2019 "Structured Minimally Supervised Learning for Neural Relation Extraction" ☆21 Feb 9, 2020 Updated 6 years ago
- Repository for the Question Answering via Sentence Composition (QASC) dataset ☆56 Aug 2, 2023 Updated 2 years ago
- ☆30 May 20, 2022 Updated 3 years ago
- Decoding platform for machine translation research ☆54 Aug 24, 2019 Updated 6 years ago
- ☆25 May 21, 2018 Updated 7 years ago
- Code and Data for the paper Investigating Evaluation of Open-Domain Dialogue Systems With Human Generated Multiple References SIGdial 201… ☆28 Mar 6, 2020 Updated 5 years ago
- This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Data… ☆27 Oct 20, 2022 Updated 3 years ago
- Implementation of “Unsupervised Neural Machine Translation with SMT as Posterior Regularization” (AAAI 2019) ☆31 Mar 27, 2019 Updated 6 years ago