MorinoseiMorizo/jparacrawl-finetune

An example usage of JParaCrawl pre-trained Neural Machine Translation (NMT) models.

☆105

Alternatives and similar repositories for jparacrawl-finetune

Users who are interested in jparacrawl-finetune are comparing it to the libraries listed below.

laboroai / Laboro-ParaCorpus
Scripts for creating a Japanese-English parallel corpus and training NMT models
☆19 | Nov 9, 2021 | Updated 4 years ago
rpryzant / JESC
A large parallel corpus of English and Japanese
☆90 | Nov 1, 2017 | Updated 8 years ago
nttcslab-nlp / word_align
A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT
☆26 | Jan 27, 2021 | Updated 5 years ago
tsuruoka-lab / AMI-Meeting-Parallel-Corpus
AMI Meeting Parallel Corpus
☆13 | Dec 11, 2020 | Updated 5 years ago
cl-tohoku / PheMT
A phenomenon-wise evaluation dataset for Japanese-English machine translation robustness. The dataset is based on the MTNT dataset, with …
☆19 | Feb 18, 2021 | Updated 5 years ago
teaspn / teaspn-server
A sample implementation of the TEASPN server
☆18 | Oct 31, 2019 | Updated 6 years ago
odashi / small_parallel_enja
50k English-Japanese Parallel Corpus for Machine Translation Benchmark.
☆97 | Sep 11, 2019 | Updated 6 years ago
M4t1ss / parallel-corpora-tools
Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.
☆42 | Dec 19, 2023 | Updated 2 years ago
thammegowda / mtdata
A tool that locates, downloads, and extracts machine translation corpora
☆166 | Apr 13, 2026 | Updated 3 months ago
mjpost / sacrebleu
Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons
☆1,253 | Jul 17, 2026 | Updated last week
shyyhs / CourseraParallelCorpusMining
Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation
☆15 | Aug 27, 2024 | Updated last year
Helsinki-NLP / OpusFilter
OpusFilter - Parallel corpus processing toolkit
☆115 | Jul 1, 2026 | Updated 3 weeks ago
agesmundo / HadoopPerceptron
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/36266.pdf
☆14 | Apr 25, 2012 | Updated 14 years ago
octanove / shiba
Pytorch implementation and pre-trained Japanese model for CANINE, the efficient character-level transformer.
☆89 | Nov 3, 2023 | Updated 2 years ago
aistairc / trf
This is the repository for TRF (text readability features) publication.
☆37 | Aug 27, 2019 | Updated 6 years ago
cl-tohoku / bert-japanese
BERT models for Japanese text.
☆548 | Mar 23, 2024 | Updated 2 years ago
nttcslab-nlp / RIBES
RIBES is an automatic evaluation metric for machine translation.
☆13 | Sep 7, 2017 | Updated 8 years ago
shamilcm / pedra
Post-editing Datasets by Rakuten (PEDRa)
☆14 | Jun 23, 2021 | Updated 5 years ago
tsuruoka-lab / BSD
The Business Scene Dialogue corpus
☆75 | Nov 10, 2021 | Updated 4 years ago
odashi / mteval
Collection of Evaluation Metrics and Algorithms for Machine Translation
☆76 | Mar 5, 2018 | Updated 8 years ago
CLARIN-PL / Inforex
Inforex is a web system for text corpora construction.
☆12 | Jun 24, 2026 | Updated last month
mlpnlp / mlpnlp-nmt
This is a sample code of "LSTM encoder-decoder with attention mechanism" mainly for understanding a recently developed machine translatio…
☆44 | Mar 14, 2019 | Updated 7 years ago
bitextor / bitextor
Bitextor generates translation memories from multilingual websites
☆299 | Nov 11, 2024 | Updated last year
matasuke / NAIST_essay
☆12 | Dec 18, 2018 | Updated 7 years ago
neubig / nmt-tips
A tutorial about neural machine translation including tips on building practical systems
☆369 | Nov 16, 2016 | Updated 9 years ago
mingruimingrui / fast-mosestokenizer
c++ mosestokenizer
☆18 | Mar 13, 2024 | Updated 2 years ago
MicrosoftTranslator / MSLT-Corpus
Microsoft Speech Language Translation (MSLT) Corpus
☆19 | Sep 18, 2017 | Updated 8 years ago
megagonlabs / ginza
A Japanese NLP Library using spaCy as framework based on Universal Dependencies
☆862 | Jul 10, 2026 | Updated 2 weeks ago
bitextor / bicleaner
Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.
☆160 | Jun 18, 2024 | Updated 2 years ago
chemicaltree / tetra
☆10 | Sep 14, 2022 | Updated 3 years ago
icoxfog417 / acl-anthology
Script to get ACL Anthology
☆16 | Jan 2, 2025 | Updated last year
amazon-science / doc-mt-metrics
☆29 | Jul 30, 2024 | Updated last year
browsermt / students
Efficient teacher-student models and scripts to make them
☆57 | Dec 16, 2023 | Updated 2 years ago
hiroshi-manabe / japanese_verb_adj_list
A list of Japanese verbs and adjectives.
☆23 | Oct 1, 2025 | Updated 9 months ago
BandaiNamcoResearchInc / DistilBERT-base-jp
☆161 | Oct 19, 2020 | Updated 5 years ago
YerevaNN / PARASITE
🪱 PARASITE || A parallel sentence data preprocessing toolkit. Originally developed as a part of the `en-ru` winner submission of WMT20 B…
☆11 | Jun 8, 2021 | Updated 5 years ago
ymym3412 / acl-papers
paper summary of Association for Computational Linguistics
☆185 | Sep 16, 2019 | Updated 6 years ago
Unbabel / smaug
Python package to augment multilingual data
☆15 | Feb 15, 2023 | Updated 3 years ago
ikegami-yukino / zunda-python
Zunda: Japanese Enhanced Modality Analyzer client for Python.
☆10 | Nov 30, 2019 | Updated 6 years ago