A large parallel corpus of English and Japanese
☆90Nov 1, 2017Updated 8 years ago
Alternatives and similar repositories for JESC
Users that are interested in JESC are comparing it to the libraries listed below.
- An example usage of JParaCrawl pre-trained Neural Machine Translation (NMT) models.☆105Apr 29, 2021Updated 5 years ago
- ☆22Aug 18, 2020Updated 5 years ago
- 50k English-Japanese Parallel Corpus for Machine Translation Benchmark.☆98Sep 11, 2019Updated 6 years ago
- ☆22Dec 20, 2019Updated 6 years ago
- A parallel evaluation data set of SAP software documentation with document structure annotation☆15Jul 30, 2025Updated 9 months ago
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation☆15Aug 27, 2024Updated last year
- Scripts for creating a Japanese-English parallel corpus and training NMT models☆18Nov 9, 2021Updated 4 years ago
- The Business Scene Dialogue corpus☆74Nov 10, 2021Updated 4 years ago
- ☆16Aug 20, 2020Updated 5 years ago
- Decoding platform for machine translation research☆54Aug 24, 2019Updated 6 years ago
- Cynical data selection☆20Jan 16, 2021Updated 5 years ago
- A Neural Machine Translation implementation in Chainer☆46May 22, 2020Updated 5 years ago
- Bitextor generates translation memories from multilingual websites☆299Nov 11, 2024Updated last year
- ☆63Feb 28, 2021Updated 5 years ago
- Meedan's Open Source Arabic/English Translation Memory☆33Nov 4, 2009Updated 16 years ago
- ☆43Sep 16, 2020Updated 5 years ago
- Practical example from Human-in-the-Loop Machine Learning book☆11Oct 28, 2021Updated 4 years ago
- Tools for extracting parallel corpora from article titles across languages in Wikipedia☆74Feb 25, 2015Updated 11 years ago
- ☆24Nov 29, 2017Updated 8 years ago
- Efficient teacher-student models and scripts to make them☆57Dec 16, 2023Updated 2 years ago
- Efficient Markov Chain word alignment☆53Aug 1, 2021Updated 4 years ago
- Korean Parallel Corpus☆147Feb 24, 2024Updated 2 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.☆160Jun 18, 2024Updated last year
- MT Evaluation in Many Languages via Zero-Shot Paraphrasing☆102Jul 25, 2024Updated last year
- Neural machine translation soft alignment visualisations for web and command line☆73Aug 19, 2021Updated 4 years ago
- Kyoto University Web Document Leads Corpus☆84Dec 18, 2023Updated 2 years ago
- Lexically Constrained Neural Machine Translation with Levenshtein Transformer☆40Jul 14, 2020Updated 5 years ago
- My implementation of LASER architecture in Fairseq☆12Oct 6, 2020Updated 5 years ago
- ☆22Oct 26, 2020Updated 5 years ago
- ☆23Apr 19, 2026Updated 2 weeks ago
- Modified version of fairseq, including new implementations for criterions using reinforcement learning methods.☆11Aug 14, 2019Updated 6 years ago
- Yet another sentence-level tokenizer for Japanese text☆24Nov 27, 2025Updated 5 months ago
- Hadoop-based tool for extraction of large scale synchronous grammars for paraphrasing and machine translation☆15Dec 2, 2016Updated 9 years ago
- ☆15Nov 5, 2020Updated 5 years ago
- CaboCha wrapper for Python3☆46Jul 5, 2018Updated 7 years ago
- A High-Quality Multilingual Dataset for Structured Documentation Translation☆38May 1, 2025Updated last year
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages☆11Feb 6, 2024Updated 2 years ago
- Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons☆1,240Jan 12, 2026Updated 3 months ago
- A summarizer for Japanese articles (but ChatGPT is better)☆10Aug 1, 2022Updated 3 years ago