Helsinki-NLP / OPUS-MT-train
Training open neural machine translation models
☆346 Updated 5 months ago
Alternatives and similar repositories for OPUS-MT-train:
Users who are interested in OPUS-MT-train compare it to the libraries listed below.
- Open neural machine translation models and web services ☆640 Updated last month
- A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te… ☆263 Updated this week
- Facebook Low Resource (FLoRes) MT Benchmark ☆717 Updated last year
- Easy to use, state-of-the-art Neural Machine Translation for 100+ languages ☆1,195 Updated last year
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆152 Updated 7 months ago
- Bitextor generates translation memories from multilingual websites ☆293 Updated 2 months ago
- The FLORES+ Machine Translation Benchmark ☆100 Updated 2 months ago
- A neural word aligner based on multilingual BERT ☆336 Updated 2 years ago
- A tool that locates, downloads, and extracts machine translation corpora ☆149 Updated 7 months ago
- Multilingual sentence alignment using sentence embeddings ☆106 Updated 2 months ago
- Open information and community for machine translation ☆72 Updated last month
- Neural Machine Translation (NMT) tutorial. Data preprocessing, model training, evaluation, and deployment. ☆155 Updated 9 months ago
- BigTranslate: Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages ☆221 Updated last year
- Library for translating between 200 languages. Built on 🤗 transformers. ☆467 Updated 4 months ago
- Improved Sentence Alignment in Linear Time and Space ☆163 Updated last year
- This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences fro… ☆159 Updated 3 months ago
- Fast Neural Machine Translation in C++ ☆1,275 Updated last year
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆234 Updated 2 years ago
- ⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x. ☆571 Updated last year
- OpusFilter - Parallel corpus processing toolkit ☆104 Updated this week
- Training scripts for Argos Translate ☆127 Updated 2 months ago
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆353 Updated last year
- NeuSpell: A Neural Spelling Correction Toolkit ☆681 Updated last year
- CoVoST: A Large-Scale Multilingual Speech-To-Text Translation Corpus (CC0 Licensed) ☆359 Updated 3 years ago
- 🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box. ☆826 Updated 4 months ago
- Python port of Moses tokenizer, truecaser and normalizer ☆490 Updated 7 months ago
- Easier Automatic Sentence Simplification Evaluation ☆160 Updated last year
- Easy-Translate is a script for translating large text files with a SINGLE COMMAND. Easy-Translate is designed to be as easy as possible f… ☆196 Updated 2 months ago
- ☆489 Updated 11 months ago
- Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way. ☆818 Updated last month