Helsinki-NLP / Tatoeba-Challenge
☆796 · Updated 3 weeks ago
Related projects:
- Facebook Low Resource (FLoRes) MT Benchmark ☆686 · Updated 9 months ago
- Open neural machine translation models and web services ☆598 · Updated 2 months ago
- Crawl BookCorpus ☆799 · Updated last year
- ☆1,240 · Updated last year
- Fast Neural Machine Translation in C++ ☆1,225 · Updated last year
- Training open neural machine translation models ☆321 · Updated last month
- Bitextor generates translation memories from multilingual websites ☆287 · Updated 3 months ago
- ☆480 · Updated 7 months ago
- A neural word aligner based on multilingual BERT ☆321 · Updated 2 years ago
- ☆1,470 · Updated last year
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆347 · Updated 10 months ago
- Simple, fast unsupervised word aligner ☆732 · Updated 2 years ago
- Library for translating between 200 languages. Built on 🤗 transformers. ☆437 · Updated 2 weeks ago
- NeuSpell: A Neural Spelling Correction Toolkit ☆662 · Updated last year
- GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors ☆482 · Updated 4 years ago
- Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing ☆725 · Updated 5 months ago
- Language-Agnostic SEntence Representations ☆3,576 · Updated 4 months ago
- An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/p… ☆432 · Updated 2 years ago
- XTREME is a benchmark for the evaluation of the cross-lingual generalization ability of pre-trained multilingual models that covers 40 ty… ☆629 · Updated last year
- A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology… ☆220 · Updated last year
- BLEURT is a metric for Natural Language Generation based on transfer learning. ☆685 · Updated last year
- Easy to use, state-of-the-art Neural Machine Translation for 100+ languages ☆1,146 · Updated 8 months ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆148 · Updated 3 months ago
- Models to perform neural summarization (extractive and abstractive) using machine learning transformers and a tool to convert abstractive… ☆428 · Updated last year
- ⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x. ☆561 · Updated last year
- Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons ☆1,040 · Updated last month
- Python port of Moses tokenizer, truecaser and normalizer ☆486 · Updated 3 months ago
- Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX. ☆251 · Updated last year
- 📃 Language Model based sentences scoring library ☆300 · Updated 2 years ago
- This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences fro… ☆152 · Updated 2 years ago