ueda-keisuke / CC-CEDICT-MeCabLinks

CC-CEDICT-MeCab is a MeCab dictionary for Chinese (Mandarin) text segmentation

☆13

Alternatives and similar repositories for CC-CEDICT-MeCab

Users that are interested in CC-CEDICT-MeCab are comparing it to the libraries listed below

Sorting:

jqk09a / japanese-daily-dialogue
☆53Updated 2 years ago
KoichiYasuoka / SuPar-UniDic
Tokenizer POS-tagger Lemmatizer and Dependency-parser for modern and contemporary Japanese with BERT models
☆20Updated 2 months ago
daac-tools / crawdad
🦞 Rust library of natural language dictionaries using character-wise double-array tries.
☆36Updated 10 months ago
wtetsu / deinja
🌸De-inflect Japanese words
☆15Updated 7 months ago
yagays / kanjivg-radical
☆102Updated 7 years ago
thino-rma / fts5_mecab
sqlite3 fts5 mecab
☆22Updated 6 years ago
WorksApplications / chikkarpy
Japanese synonym library
☆54Updated 3 years ago
WorksApplications / ViSudachi
A tool for visualizing the internal structures of morphological analyzer Sudachi
☆18Updated 3 years ago
shogo82148 / TinySegmenterMaker
☆72Updated 3 years ago
polm / unidic-lite
A small version of UniDic for easy pip installs.
☆47Updated 5 years ago
Japanese-Accent-Circle / japanese-accent
☆11Updated 3 years ago
daac-tools / vaporetto
🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer
☆248Updated 2 weeks ago
ikegami-yukino / rakutenma-python
Rakuten MA (Python version)
☆22Updated 8 years ago
ndl-lab / huriganacorpus-ndlbib
全国書誌データから作成した振り仮名のデータセット
☆28Updated 4 years ago
s-taka / fugumt
☆63Updated 4 years ago
megagonlabs / bunkai
Sentence boundary disambiguation tool for Japanese texts (日本語文境界判定器)
☆199Updated last year
WorksApplications / uzushio
☆23Updated 9 months ago
wwwcojp / ja_sentence_segmenter
japanese sentence segmentation library for python
☆73Updated 2 years ago
neologd / mecab-unidic-neologd
Neologism dictionary based on the language resources on the Web for mecab-unidic
☆86Updated 5 years ago
WorksApplications / SudachiDict
A lexicon for Sudachi
☆267Updated 2 weeks ago
fulmicoton / kuromoji-rs
Japanese tokenizer for rust
☆35Updated 6 years ago
WorksApplications / sudachi.rs
Sudachi in Rust 🦀 and new generation of SudachiPy
☆396Updated 5 months ago
masayu-a / WLSP
Word List by Semantic Principles (WLSP): “It is a collection of words classified and arranged by their meanings”
☆59Updated 3 months ago
polm / unidic-py
Unidic packaged for installation via pip.
☆106Updated 8 months ago
himkt / konoha
🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.
☆259Updated 6 months ago
yas-sim / handwritten-japanese-ocr
Handwritten Japanese OCR demo using touch panel to draw the input text using Intel OpenVINO toolkit
☆38Updated 3 years ago
KoichiYasuoka / UniDic2UD
Tokenizer POS-tagger Lemmatizer and Dependency-parser for modern and contemporary Japanese
☆37Updated last year
azooKey / AJIMEE-Bench
AJIMEE-Bench (Advanced Japanese IME Evaluation Benchmark)
☆11Updated 10 months ago
ikegami-yukino / mozcpy
Mozc for Python: Kana-Kanji converter
☆46Updated 9 months ago
taishi-i / toiro
A tool for comparing tokenizers
☆120Updated last week