☆42 · Jul 17, 2018 · Updated 7 years ago
Alternatives and similar repositories for zipporah
Users that are interested in zipporah are comparing it to the libraries listed below
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆160 · Jun 18, 2024 · Updated last year
- Simple LSTM language modelling toolkit ☆10 · Oct 21, 2022 · Updated 3 years ago
- Tool for manual evaluation of parallel sentences. ☆15 · Jan 26, 2026 · Updated last month
- Data collection, alignment and TAUS repository ☆23 · Nov 30, 2017 · Updated 8 years ago
- XenC: open-source data selection tool for NLP ☆64 · Mar 21, 2016 · Updated 9 years ago
- Reproduction instructions for "Rapid Adaptation of Neural Machine Translation to New Languages" ☆39 · Aug 7, 2018 · Updated 7 years ago
- Scripts to preprocess training and test data and to run fast_align and giza ☆107 · Nov 2, 2021 · Updated 4 years ago
- Efficient Low-Memory Aligner ☆146 · Jan 15, 2025 · Updated last year
- A handy dataset of noises for ASR ☆22 · May 29, 2019 · Updated 6 years ago
- Efficient Markov Chain word alignment ☆53 · Aug 1, 2021 · Updated 4 years ago
- Bitextor generates translation memories from multilingual websites ☆302 · Nov 11, 2024 · Updated last year
- A bunch of scripts exploiting several tools to perform inverse text normalization (ITN) ☆21 · Sep 27, 2017 · Updated 8 years ago
- Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks. ☆41 · Dec 19, 2023 · Updated 2 years ago
- A python implementation of the neural network joint language model and an extension of it using global source context. ☆11 · May 17, 2017 · Updated 8 years ago
- Convert words to numbers ☆21 · Apr 13, 2022 · Updated 3 years ago
- ☆24 · Nov 29, 2017 · Updated 8 years ago
- Appraise evaluation system for manual evaluation of machine translation output ☆77 · May 7, 2021 · Updated 4 years ago
- YiSi: A Semantic Machine Translation Evaluation Metric for Evaluating Languages with Different Levels of Available Resources ☆26 · May 28, 2019 · Updated 6 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages ☆11 · Feb 6, 2024 · Updated 2 years ago
- recent audio generation papers (including speech, music and general audios) ☆13 · Mar 14, 2023 · Updated 2 years ago
- This is an extension of kaldi speech recognition software which allows to perform decoding of speech with hybrid word and phoneme graphs.… ☆11 · Feb 4, 2020 · Updated 6 years ago
- Tool to fix bitexts and tag near-duplicates for removal