himkt / konoha
🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.
☆251 · Updated last month
Alternatives and similar repositories for konoha
Users that are interested in konoha are comparing it to the libraries listed below
- Python version of Sudachi, a Japanese tokenizer. ☆405 · Updated 2 years ago
- Juman++ (a Morphological Analyzer Toolkit) ☆392 · Updated last year
- A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis. ☆457 · Updated 3 weeks ago
- A Japanese tokenizer based on recurrent neural networks ☆401 · Updated last month
- A comparison tool of Japanese tokenizers ☆121 · Updated last year
- A list of pre-trained BERT models for Japanese with word/subword tokenization + vocabulary construction algorithm information ☆130 · Updated 2 years ago
- Unidic packaged for installation via pip. ☆97 · Updated 4 months ago
- Sudachi in Rust 🦀 and new generation of SudachiPy ☆361 · Updated last week
- mecab-python. You can find the original version here: taku910.github.io/mecab/ ☆559 · Updated 7 months ago
- Sentence boundary disambiguation tool for Japanese texts (日本語文境界判定器) ☆191 · Updated last year
- A lexicon for Sudachi ☆254 · Updated last month
- natto-py combines the Python programming language with MeCab, the part-of-speech and morphological analyzer for the Japanese language. ☆94 · Updated last year
- An integrated Japanese analyzer based on foundation models ☆133 · Updated 3 weeks ago
- Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku, and Zenkaku ☆331 · Updated 5 months ago
- Aims to make JapaneseTokenizer as easy to use as possible ☆139 · Updated 6 years ago
- BERT models for Japanese text. ☆535 · Updated last year
- Lightweight converter from Japanese Kana-kanji sentences into Kana-Roman. ☆433 · Updated 2 years ago
- 🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer ☆238 · Updated 2 weeks ago
- A small version of UniDic for easy pip installs. ☆43 · Updated 4 years ago
- A Python Module for JUMAN++/KNP ☆91 · Updated last week
- Japanese word embedding with Sudachi and NWJC 🌿 ☆165 · Updated last year
- A Japanese NLP Library using spaCy as framework based on Universal Dependencies ☆796 · Updated last year
- Japanese text normalizer for mecab-neologd ☆280 · Updated 3 months ago
- Japanese tokenizer for Transformers ☆79 · Updated last year
- Kyoto University Web Document Leads Corpus ☆83 · Updated last year
- A Japanese named entity extraction dataset built using Wikipedia ☆141 · Updated last year
- Japanese Word Similarity Dataset ☆101 · Updated 3 years ago
- This repository is archived! The maintained MeCab can be found at https://github.com/shogo82148/mecab ☆258 · Updated 8 months ago
- JGLUE: Japanese General Language Understanding Evaluation ☆318 · Updated 2 months ago
- Emotion analyzer for Japanese text ☆115 · Updated 11 months ago