rpryzant/JESC

A large parallel corpus of English and Japanese

☆90

Alternatives and similar repositories for JESC

Users who are interested in JESC are comparing it to the repositories listed below.

MorinoseiMorizo / jparacrawl-finetune
An example usage of JParaCrawl pre-trained Neural Machine Translation (NMT) models.
☆105 · Apr 29, 2021 · Updated 5 years ago
venali / BilingualCorpus
☆22 · Aug 18, 2020 · Updated 5 years ago
odashi / small_parallel_enja
50k English-Japanese Parallel Corpus for Machine Translation Benchmark.
☆97 · Sep 11, 2019 · Updated 6 years ago
transducens / LASERtrain
☆22 · Dec 20, 2019 · Updated 6 years ago
SAP / software-documentation-data-set-for-machine-translation
A parallel evaluation data set of SAP software documentation with document structure annotation
☆15 · Jun 12, 2026 · Updated last month
ikegami-yukino / zunda-python
Zunda: Japanese Enhanced Modality Analyzer client for Python.
☆10 · Nov 30, 2019 · Updated 6 years ago
shyyhs / CourseraParallelCorpusMining
Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation
☆15 · Aug 27, 2024 · Updated last year
laboroai / Laboro-ParaCorpus
Scripts for creating a Japanese-English parallel corpus and training NMT models
☆19 · Nov 9, 2021 · Updated 4 years ago
ucam-smt / sgnmt
Decoding platform for machine translation research
☆54 · Aug 24, 2019 · Updated 6 years ago
rycolab / bfbs
☆16 · Aug 20, 2020 · Updated 5 years ago
amittai / cynical
Cynical data selection
☆20 · Jan 16, 2021 · Updated 5 years ago
s-taka / fugumt
☆63 · Feb 28, 2021 · Updated 5 years ago
bitextor / bitextor
Bitextor generates translation memories from multilingual websites
☆299 · Nov 11, 2024 · Updated last year
rmunro / headlines
Practical example from Human-in-the-Loop Machine Learning book
☆11 · Oct 28, 2021 · Updated 4 years ago
jungokasai / twist_decoding
☆30 · May 20, 2022 · Updated 4 years ago
paracrawl / extractor
☆24 · Nov 29, 2017 · Updated 8 years ago
robertostling / efmaral
Efficient Markov Chain word alignment
☆53 · Aug 1, 2021 · Updated 4 years ago
browsermt / students
Efficient teacher-student models and scripts to make them
☆57 · Dec 16, 2023 · Updated 2 years ago
raymondhs / fairseq-laser
My implementation of LASER architecture in Fairseq
☆12 · Oct 6, 2020 · Updated 5 years ago
xmyunmai / OCR_APP_IDcard
1. ID card recognition: take a photo or import an ID card image for recognition
☆13 · Sep 27, 2022 · Updated 3 years ago
xmyunmai / OCR_APP_DL
1. Driver's license recognition: take a photo or import a driver's license image for recognition
☆11 · Sep 27, 2022 · Updated 3 years ago
bitextor / bicleaner
Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.
☆160 · Jun 18, 2024 · Updated 2 years ago
jungyeul / korean-parallel-corpora
Korean Parallel Corpus
☆147 · Feb 24, 2024 · Updated 2 years ago
ku-nlp / KWDLC
Kyoto University Web Document Leads Corpus
☆84 · Dec 18, 2023 · Updated 2 years ago
thompsonb / prism
MT Evaluation in Many Languages via Zero-Shot Paraphrasing
☆102 · Jul 25, 2024 · Updated 2 years ago
ikegami-yukino / sengiri
Yet another sentence-level tokenizer for Japanese text
☆24 · Nov 27, 2025 · Updated 8 months ago
M4t1ss / SoftAlignments
Neural machine translation soft alignment visualisations for web and command line
☆73 · Aug 19, 2021 · Updated 4 years ago
raymondhs / constrained-levt
Lexically Constrained Neural Machine Translation with Levenshtein Transformer
☆40 · Jul 14, 2020 · Updated 6 years ago
xlhex / dpe
☆22 · Oct 26, 2020 · Updated 5 years ago
kenkov / cabocha
CaboCha wrapper for Python3
☆46 · Jul 5, 2018 · Updated 8 years ago
TianchunH97 / fairseq-rl
Modified version of fairseq, including new implementations for criterions using reinforcement learning methods.
☆11 · Aug 14, 2019 · Updated 6 years ago
joshua-decoder / thrax
Hadoop-based tool for extraction of large scale synchronous grammars for paraphrasing and machine translation
☆15 · Dec 2, 2016 · Updated 9 years ago
google / wmt19-paraphrased-references
☆15 · Nov 5, 2020 · Updated 5 years ago
cl-tohoku / keigo_transfer_task
Evaluation dataset for the honorific (keigo) conversion task
☆21 · Nov 24, 2022 · Updated 3 years ago
mjpost / sacrebleu
Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons
☆1,254 · Jul 17, 2026 · Updated last week
ikegami-yukino / dataset-list
lists of text corpus and more (mainly Japanese)
☆119 · Jul 25, 2024 · Updated 2 years ago
skozawa / Comainu
COrpus based Morphological Analyzer with INtegrated User dictionary
☆21 · Mar 30, 2025 · Updated last year
tsuruoka-lab / AMI-Meeting-Parallel-Corpus
AMI Meeting Parallel Corpus
☆13 · Dec 11, 2020 · Updated 5 years ago
modernmt / DataCollection
Data collection, alignment and TAUS repository
☆24 · Nov 30, 2017 · Updated 8 years ago