Helsinki-NLP / Tatoeba-Challenge
☆822 · Updated 8 months ago
Alternatives and similar repositories for Tatoeba-Challenge
Users interested in Tatoeba-Challenge are comparing it to the repositories listed below:
- A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology… ☆223 · Updated 2 years ago
- Facebook Low Resource (FLoRes) MT Benchmark ☆727 · Updated last year
- Open neural machine translation models and web services ☆681 · Updated 4 months ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆157 · Updated 10 months ago
- Tools and Modeling Code for the MASSIVE dataset ☆545 · Updated 2 years ago
- A neural word aligner based on multilingual BERT ☆346 · Updated 3 years ago
- ☆1,271 · Updated 2 years ago
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆361 · Updated last year
- simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 that lets you quickly train your T5 models. ☆394 · Updated last year
- Simple, fast unsupervised word aligner ☆751 · Updated 2 years ago
- Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagg… ☆920 · Updated 11 months ago
- BLEURT is a metric for Natural Language Generation based on transfer learning. ☆727 · Updated last year
- Library for translating between 200 languages. Built on 🤗 transformers. ☆479 · Updated 7 months ago
- Evaluating Cross-lingual Sentence Representations ☆452 · Updated 3 years ago
- Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and Pre-trained models for paper "Parallel Iterative Edit Models … ☆231 · Updated 2 years ago
- Training open neural machine translation models ☆357 · Updated last month
- NeuSpell: A Neural Spelling Correction Toolkit ☆692 · Updated last year
- Bitextor generates translation memories from multilingual websites ☆292 · Updated 5 months ago
- Crawl BookCorpus ☆831 · Updated last year
- Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons ☆1,124 · Updated last month
- Models to perform neural summarization (extractive and abstractive) using machine learning transformers and a tool to convert abstractive… ☆430 · Updated last year
- ⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x. ☆578 · Updated 2 years ago
- Fast Neural Machine Translation in C++ ☆1,316 · Updated last year
- GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors ☆505 · Updated 5 years ago
- ☆503 · Updated last year
- EMNLP 2020: "Dialogue Response Ranking Training with Large-Scale Human Feedback Data" ☆339 · Updated 5 months ago
- A tool for holistic analysis of language generation systems ☆468 · Updated 3 years ago
- A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te… ☆272 · Updated 3 months ago
- ERRor ANnotation Toolkit: Automatically extract and classify grammatical errors in parallel original and corrected sentences. ☆444 · Updated last year
- Python port of Moses tokenizer, truecaser and normalizer ☆495 · Updated 10 months ago