An official implementation of the algorithm from "BPE-Dropout: Simple and Effective Subword Regularization".
☆54 Feb 17, 2021 Updated 5 years ago
Alternatives and similar repositories for BPE-Dropout
Users that are interested in BPE-Dropout are comparing it to the libraries listed below.
- ☆22 Oct 26, 2020 Updated 5 years ago
- A repository for experiments in quality-aware decoding ☆18 Jun 7, 2022 Updated 4 years ago
- Spoken Language Translation System ☆20 Jul 26, 2021 Updated 4 years ago
- ☆29 Apr 15, 2023 Updated 3 years ago
- Code for Findings of ACL 2023 paper "Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency … ☆10 Jul 18, 2023 Updated 2 years ago
- An example of DyNet autobatching for the NIPS "how to code a paper" workshop ☆12 Dec 9, 2017 Updated 8 years ago
- ☆16 May 14, 2024 Updated 2 years ago
- ☆13 Sep 25, 2024 Updated last year
- Efficient, Extensible kNN-MT Framework ☆19 Sep 7, 2024 Updated last year
- ☆14 Aug 9, 2021 Updated 4 years ago
- ☆12 Aug 9, 2021 Updated 4 years ago
- ☆17 Apr 28, 2022 Updated 4 years ago
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022) ☆12 Mar 12, 2024 Updated 2 years ago
- Codes for the paper Hierarchical Contextualized Representation for Named Entity Recognition ☆51 Jan 1, 2020 Updated 6 years ago
- End-to-end MOdeling of ASR (Automatic Speech Recognition) ☆33 Feb 16, 2023 Updated 3 years ago
- Japanese--Russian--English News Commentary Parallel Data ☆18 Jul 9, 2019 Updated 6 years ago
- Code for NAACL 2022 main conference paper "Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation" ☆12 May 8, 2023 Updated 3 years ago
- Unsupervised text tokenizer focused on computational efficiency ☆979 Mar 29, 2024 Updated 2 years ago
- An official implementation of the paper "Addressing Segmentation Ambiguity in Neural Linguistic Steganography" ☆14 Nov 12, 2022 Updated 3 years ago
- System Combination ☆16 Aug 28, 2015 Updated 10 years ago
- ☆16 Aug 1, 2025 Updated 10 months ago
- Python port of Moses tokenizer, truecaser and normalizer ☆495 Feb 6, 2026 Updated 4 months ago
- A Language-Independent Unsupervised Morphological Segmentation Framework based on Adaptor Grammars ☆17 Jun 14, 2024 Updated 2 years ago
- This repository contains a fine-tuning script for the transcription task of Mistral's Voxtral model. ☆27 Jul 31, 2025 Updated 10 months ago
- Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model" ☆22 Jun 28, 2024 Updated last year
- ☆13 Dec 11, 2020 Updated 5 years ago
- ☆13 Oct 17, 2020 Updated 5 years ago
- A library for minimum Bayes risk (MBR) decoding ☆52 Nov 2, 2025 Updated 7 months ago
- Code for Controlling Hallucinations at Word Level in Data-to-Text Generation (C. Rebuffel, M. Roberti, L. Soulier, G. Scoutheeten, R. Can… ☆16 Jun 12, 2023 Updated 3 years ago
- Code for the CIKM 2019 Paper: How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations ☆32 Jun 12, 2023 Updated 3 years ago
- Parallel corpora for the biomedical domain ☆51 Mar 27, 2026 Updated 2 months ago
- Conformer: Convolution-augmented Transformer for Speech Recognition ☆15 Sep 4, 2025 Updated 9 months ago
- Conformer RNN-Transducer ☆14 May 25, 2022 Updated 4 years ago
- Machine Translation Evaluation Metric ☆39 Dec 6, 2017 Updated 8 years ago
- A collection of basic text processing modules focused on Gujarati ☆10 Oct 24, 2017 Updated 8 years ago
- A Controllable Model of Grounded Response Generation (AAAI 21) ☆13 Oct 25, 2022 Updated 3 years ago
- ☆93 Feb 13, 2024 Updated 2 years ago
- The implementation of "Learning Deep Transformer Models for Machine Translation" ☆116 Jul 25, 2024 Updated last year
- This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Data… ☆27 Oct 20, 2022 Updated 3 years ago