luismsgomes / mosestokenizer
☆20 Updated 3 years ago
Alternatives and similar repositories for mosestokenizer:
Users who are interested in mosestokenizer are comparing it to the libraries listed below:
- Statistics on multilingual datasets ☆17 Updated 2 years ago
- OpusFilter - Parallel corpus processing toolkit ☆104 Updated last month
- ☆28 Updated 11 months ago
- Pretraining scripts for BART transformer model ☆11 Updated last year
- This is the official repository for NAACL 2021, "XOR QA: Cross-lingual Open-Retrieval Question Answering". ☆79 Updated 3 years ago
- We release a dataset based on Wikipedia sentences and the corresponding translations in 6 different languages along with the scores (scal… ☆81 Updated 3 years ago
- Lexically Constrained Neural Machine Translation with Levenshtein Transformer ☆39 Updated 4 years ago
- ☆23 Updated last year
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆31 Updated 2 years ago
- A library of translation-based text similarity measures ☆25 Updated last year
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT ☆21 Updated last year
- Automatically harvested multilingual contrastive word sense disambiguation test sets for machine translation ☆17 Updated 4 years ago
- ☆36 Updated 3 years ago
- Code for our EACL-2021 paper "Generating Syntactically Controlled Paraphrases without Using Annotated Parallel Pairs". ☆39 Updated 10 months ago
- Python package to augment multilingual data ☆14 Updated 2 years ago
- ☆16 Updated 4 years ago
- Zero-shot Transfer Learning from English to Arabic ☆29 Updated 2 years ago
- Multilingual abstractive summarization dataset extracted from WikiHow. ☆91 Updated last month
- Code and data for the NAACL 2021 paper: "XFORMAL: A Benchmark for Multilingual Formality Style Transfer" ☆12 Updated 3 years ago
- ☆21 Updated 2 years ago
- Tool to fix bitexts and tag near-duplicates for removal ☆30 Updated 3 months ago
- EMNLP 2021 Tutorial: Multi-Domain Multilingual Question Answering ☆38 Updated 3 years ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆30 Updated 3 years ago
- ParCourE - Parallel Corpus Explorer ☆12 Updated 3 years ago
- PyTorch code for "FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization" (NAACL 2022) ☆39 Updated 2 years ago
- ☆33 Updated 4 years ago
- Scripts for document-level grammatical error correction. ☆18 Updated 4 years ago
- Multilingual Quality Estimation and Automatic Post-editing Dataset ☆41 Updated 3 years ago
- ☆24 Updated 2 years ago
- A tool for calculating character n-gram F-score ☆72 Updated 2 years ago