vincentzlt/textprep

Textprep is an analysis tool for parallel and non-parallel corpora and their downstream Natural Language Processing and Machine Translation tasks. It is designed especially for logographic languages such as Chinese and Japanese.

☆32

Alternatives and similar repositories for textprep

Users that are interested in textprep are comparing it to the libraries listed below.

iesl / diora
Deep Inside-Outside Recursive Autoencoder
☆89 · Jan 17, 2022 · Updated 4 years ago
ajb129 / KeyakiTreebank
Keyaki Treebank Parsed Corpus
☆10 · May 15, 2019 · Updated 7 years ago
tmu-nlp / sscorpus
A monolingual parallel corpus for sentence simplification
☆11 · Jul 4, 2016 · Updated 10 years ago
windsuzu / Joint-Semantic-Phonetic-Embedding
We use phonetics as a feature to create a joint semantic-phonetic embedding and improve the neural machine translation between Chinese an…
☆12 · Aug 3, 2021 · Updated 4 years ago
dennlinger / klexikon
Klexikon: A German Dataset for Joint Summarization and Simplification
☆17 · Oct 5, 2022 · Updated 3 years ago
akikoe / nmtrnng
C++ code of "Learning to Parse and Translate Improves Neural Machine Translation"
☆21 · May 8, 2017 · Updated 9 years ago
odashi / nmtkit
Neural Network-based Statistical Machine Translation Toolkit.
☆74 · Jul 24, 2017 · Updated 8 years ago
aaronmueller / clams
Syntactic evaluation sets, attribute-varying grammars, and code for replicating the CLAMS paper. ACL 2020.
☆17 · Nov 26, 2024 · Updated last year
fanfannothing / RNTN
Recursive Neural Tensor Networks
☆11 · Feb 3, 2014 · Updated 12 years ago
masayu-a / NAIST-JENE
☆10 · Aug 13, 2012 · Updated 13 years ago
OpenNMT / Hackathon
Resources for the OpenNMT hackathon
☆51 · May 24, 2019 · Updated 7 years ago
bcmi220 / seq2seq_parser
☆20 · Sep 23, 2018 · Updated 7 years ago
musyoku / chainer-nn
☆10 · Oct 16, 2017 · Updated 8 years ago
salesforce / localization-xml-mt
A High-Quality Multilingual Dataset for Structured Documentation Translation
☆39 · May 1, 2025 · Updated last year
ikegami-yukino / zunda-python
Zunda: Japanese Enhanced Modality Analyzer client for Python.
☆10 · Nov 30, 2019 · Updated 6 years ago
markusdr / transducersaurus
Automatically exported from code.google.com/p/transducersaurus
☆11 · Apr 1, 2015 · Updated 11 years ago
akikoe / tree2seq
C++ code of "Tree-to-Sequence Attentional Neural Machine Translation (tree2seq ANMT)"
☆57 · Jun 23, 2017 · Updated 9 years ago
konabuta / Automated-ML-Workshop
AutoML Workshop (Azure Machine Learning mainly)
☆13 · Jan 5, 2020 · Updated 6 years ago
gesceap / prime16
Nanoloop source files for the album "Prime 16"
☆11 · Mar 7, 2026 · Updated 4 months ago
clayandgithub / rnn_cws
Chinese word segmentation based on RNN
☆13 · Oct 14, 2016 · Updated 9 years ago
nandenjin / itfdic
A localized word dictionary asset for University of Tsukuba
☆12 · Sep 19, 2025 · Updated 10 months ago
roeeaharoni / string-to-tree-nmt
Source code and data for the paper "Towards String-to-Tree Neural Machine Translation"
☆16 · Dec 31, 2017 · Updated 8 years ago
facebookresearch / QA-Overlap
Code to support the paper "Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets"
☆66 · Aug 31, 2021 · Updated 4 years ago
shyyhs / CourseraParallelCorpusMining
Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation
☆15 · Aug 27, 2024 · Updated last year
rubythonode / joint-many-task-model
Multiple Different Natural Language Processing Tasks in a Single Deep Model
☆48 · Dec 5, 2018 · Updated 7 years ago
RLSNLP / Document-level-text-simplification
The repository contains the dataset and the code of the paper: Document-Level Text Simplification: Dataset, Metric and Model.
☆25 · Jun 2, 2023 · Updated 3 years ago
mkazutaka / 20231219-llmapp-meetup
☆12 · Dec 19, 2023 · Updated 2 years ago
amittai / cynical
Cynical data selection
☆20 · Jan 16, 2021 · Updated 5 years ago
akirakubo / mecab-mozcdic
☆10 · Jan 12, 2018 · Updated 8 years ago
jsenellart / papers
This repo contains notes and implementations for cherry-picked publications of my particular interest
☆12 · May 14, 2020 · Updated 6 years ago
discourse-lab / DiscourseSegmenter
A collection of various discourse segmenters
☆10 · Jun 30, 2017 · Updated 9 years ago
salesforce / coco-dst
☆53 · Jun 2, 2026 · Updated last month
tticoin / AntonymDetection
Implementation of Word Embedding-based Antonym Detection using Thesauri and Distributional Information in NAACL2015
☆35 · Mar 8, 2022 · Updated 4 years ago
kiyukuta / lencon
Implementation of "Controlling Output Length in Neural Encoder-Decoders"
☆42 · Jan 29, 2018 · Updated 8 years ago
alvations / expletives
Expletives vomiting library...
☆13 · Apr 18, 2026 · Updated 3 months ago
SimengSun / revisit-nplm
☆12 · Sep 1, 2021 · Updated 4 years ago
vanhuyz / neural-machine-translation-demo
Neural Machine Translation Experiment with TensorFlow
☆10 · May 22, 2017 · Updated 9 years ago
jojonki / Taiyaki
A Japanese morphological analysis engine built with Python and Cython 🚧
☆13 · Dec 4, 2019 · Updated 6 years ago
nekoya / garoonbot
☆10 · Feb 20, 2017 · Updated 9 years ago