neubig/kytea

The Kyoto Text Analysis Toolkit for word segmentation and pronunciation estimation, etc.

☆213

Alternatives and similar repositories for kytea

Users who are interested in kytea are comparing it to the repositories listed below.

chezou / Mykytea-python
Python wrapper for KyTea
☆36 · Mar 30, 2026 · Updated 3 months ago

hiroki13 / neural-pasa-system
☆13 · Apr 23, 2017 · Updated 9 years ago

ku-nlp / jumanpp
Juman++ (a Morphological Analyzer Toolkit)
☆414 · Apr 17, 2026 · Updated 3 months ago

tarowatanabe / cicada
cicada: a hypergraph-based toolkit for statistical machine translation based on {tree, string}-to-{tree, string} models
☆42 · Aug 9, 2021 · Updated 4 years ago

tkng / micter
micter is a micro word segmenter which splits a sentence into words.
☆15 · Jun 21, 2014 · Updated 12 years ago

practical-scheme / Gauche-compat-sicp
Compatibility features to run the code in SICP with Gauche
☆11 · Jun 29, 2024 · Updated 2 years ago

taku910 / mecab
Yet another Japanese morphological analyzer
☆1,103 · Feb 22, 2025 · Updated last year

musyoku / hpylm
C++ implementation of HPYLM
☆11 · May 2, 2017 · Updated 9 years ago

rakuten-nlp / rakutenma
Rakuten MA - morphological analyzer (word segmentor + PoS Tagger) for Chinese and Japanese written purely in JavaScript.
☆472 · Feb 2, 2019 · Updated 7 years ago

overlast / word-vector-web-api
Implementation for operating a web API of word vector models generated by Word2Vec, GloVe, etc.
☆43 · Jul 2, 2015 · Updated 11 years ago

skozawa / Comainu
COrpus based Morphological Analyzer with INtegrated User dictionary
☆21 · Mar 30, 2025 · Updated last year

timvieira / vocrf
Variable-order CRFs with structure learning
☆17 · Aug 1, 2024 · Updated last year

buruzaemon / natto-py
natto-py combines the Python programming language with MeCab, the part-of-speech and morphological analyzer for the Japanese language.
☆95 · Jun 6, 2024 · Updated 2 years ago

sicp / ikoma-sicp
☆12 · May 23, 2012 · Updated 14 years ago

Kensuke-Mitsuzawa / JapaneseTokenizers
Aims to make JapaneseTokenizer as easy to use as possible
☆138 · Mar 25, 2019 · Updated 7 years ago

neubig / pialign
pialign - A Phrasal ITG Aligner
☆24 · Apr 29, 2019 · Updated 7 years ago

redpony / cdec
Decoder, aligner, and model optimizer for statistical machine translation and other structured prediction models based on (mostly) contex…
☆185 · May 26, 2020 · Updated 6 years ago

neubig / nlptutorial
A Tutorial about Programming for Natural Language Processing
☆438 · Oct 29, 2015 · Updated 10 years ago

atilika / kuromoji
Kuromoji is a self-contained and very easy to use Japanese morphological analyzer designed for search
☆1,052 · Jan 23, 2023 · Updated 3 years ago

kpu / mtplz
Code for the paper Faster Phrase-Based Decoding by Refining Feature State
☆14 · Jan 9, 2023 · Updated 3 years ago

didi / wmt2021_triangular_mt
The baseline model code for WMT 2021 Triangular MT
☆13 · Apr 7, 2021 · Updated 5 years ago

roy-ht / pyter
☆27 · Jan 7, 2017 · Updated 9 years ago

shimo-lab / sembei
Word embeddings that do not go through word segmentation
☆14 · Mar 19, 2017 · Updated 9 years ago

WorksApplications / ViSudachi
A tool for visualizing the internal structures of morphological analyzer Sudachi
☆18 · Jun 9, 2022 · Updated 4 years ago

neologd / mecab-unidic-neologd
Neologism dictionary based on the language resources on the Web for mecab-unidic
☆88 · Sep 14, 2020 · Updated 5 years ago

neulab / xnmt
eXtensible Neural Machine Translation
☆189 · Sep 22, 2025 · Updated 10 months ago

tuzhaopeng / LC-NMT
Larger-Context NMT
☆13 · Aug 20, 2017 · Updated 8 years ago

JuliaStrings / TinySegmenter.jl
Julia version of TinySegmenter, compact Japanese tokenizer
☆21 · Nov 24, 2020 · Updated 5 years ago

odashi / nmtkit
Neural Network-based Statistical Machine Translation Toolkit.
☆74 · Jul 24, 2017 · Updated 9 years ago

syou6162 / go-easy-first
Dependency Parser with Easy-First Algorithm written in Go.
☆10 · Jun 28, 2026 · Updated last month

cocoxu / multip
source code of Multiple-instance Learning Paraphrase (MultiP) Model for Twitter
☆13 · Jun 10, 2016 · Updated 10 years ago

neubig / lamtram
lamtram: A toolkit for neural language and translation modeling
☆142 · Apr 16, 2018 · Updated 8 years ago

aizhanti / JaRuNC
Japanese--Russian--English News Commentary Parallel Data
☆18 · Jul 9, 2019 · Updated 7 years ago

domluna / Glove.jl
Implements Global Word Vectors.
☆11 · Feb 8, 2020 · Updated 6 years ago

daac-tools / crawdad
🦞 Rust library of natural language dictionaries using character-wise double-array tries.
☆38 · Jan 13, 2025 · Updated last year

tmu-nlp / JapaneseWordSimilarityDataset
Japanese Word Similarity Dataset
☆103 · Dec 7, 2021 · Updated 4 years ago

yohokuno / neural_ime
Neural IME: Neural Input Method Engine
☆67 · Dec 27, 2016 · Updated 9 years ago

WorksApplications / Sudachi
A Japanese Tokenizer for Business
☆990 · Jul 14, 2026 · Updated 2 weeks ago

ku-nlp / KWDLC
Kyoto University Web Document Leads Corpus
☆84 · Dec 18, 2023 · Updated 2 years ago