marian-nmt / marian-dev
Fast Neural Machine Translation in C++ - development repository
☆255 · Updated last month
Related projects:
- Fast and customizable text tokenization library with BPE and SentencePiece support ☆276 · Updated 2 weeks ago
- Bitextor generates translation memories from multilingual websites ☆287 · Updated 3 months ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆148 · Updated 3 months ago
- A tool that locates, downloads, and extracts machine translation corpora ☆145 · Updated 3 months ago
- Corpus preprocessing ☆95 · Updated 6 months ago
- eXtensible Neural Machine Translation ☆185 · Updated 4 years ago
- Examples, tutorials and use cases for Marian, including our WMT-2017/18 baselines. ☆78 · Updated last year
- Fast BPE ☆652 · Updated 3 months ago
- A word alignment tool based on the famous GIZA++, extended to support multi-threading, resumable training, and incremental training. ☆161 · Updated 3 years ago
- Fast Neural Machine Translation in C++ ☆1,225 · Updated last year
- Python port of Moses tokenizer, truecaser and normalizer ☆486 · Updated 3 months ago
- Efficient Low-Memory Aligner ☆135 · Updated 2 weeks ago
- Scripts and configuration files for Edinburgh neural MT submission to WMT 16 shared translation task ☆138 · Updated 3 years ago
- Improved Sentence Alignment in Linear Time and Space ☆157 · Updated last year
- A neural word aligner based on multilingual BERT ☆321 · Updated 2 years ago
- Simple, fast unsupervised word aligner ☆732 · Updated 2 years ago
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆347 · Updated 10 months ago
- Efficient teacher-student models and scripts to make them ☆48 · Updated 9 months ago
- A tool for holistic analysis of language generation systems ☆465 · Updated 2 years ago
- C++/CUDA toolkit for training sequence and sequence-to-sequence models across multiple GPUs ☆184 · Updated 7 years ago
- Sentence aligner ☆106 · Updated 3 years ago
- Automatic extraction of edited sentences from text edition histories. ☆80 · Updated 2 years ago
- Library and command line utility to do approximate string matching of a source against a bitext index and get matched source and target. ☆45 · Updated 4 months ago
- Unsupervised Statistical Machine Translation ☆227 · Updated 4 years ago
- Open language modeling toolkit based on PyTorch ☆47 · Updated this week
- ☆42 · Updated 6 years ago
- OpusFilter - Parallel corpus processing toolkit ☆101 · Updated last month
- Open-Source Machine Translation Quality Estimation in PyTorch ☆229 · Updated 2 years ago
- Appraise evaluation system for manual evaluation of machine translation output ☆73 · Updated 3 years ago
- Collection of Evaluation Metrics and Algorithms for Machine Translation ☆76 · Updated 6 years ago