yishn / chinese-tokenizer
Tokenizes Chinese texts into words.
☆100 · Updated 2 years ago
Alternatives and similar repositories for chinese-tokenizer
Users who are interested in chinese-tokenizer compare it to the libraries listed below.
- A tool to find grammar patterns in Chinese text ☆27 · Updated 5 years ago
- HanziJS is a Chinese character and NLP module for Chinese language processing for Node.js ☆389 · Updated 11 months ago
- Converts from Chinese characters to pinyin, between simplified and traditional, and does word segmentation. ☆115 · Updated 2 years ago
- Han character library for CJKV languages ☆161 · Updated 4 years ago
- Python module that identifies Chinese text as being Simplified or Traditional ☆100 · Updated 10 months ago
- Gather modern English word frequencies from all enwiki articles. ☆222 · Updated last year
- English Lemma Database - Compiled by Referencing British National Corpus ☆32 · Updated 11 months ago
- Open Chinese Dictionary (開放漢語字典): a database of modern Chinese character pronunciations ☆24 · Updated 4 years ago
- Open Language Profiles: English profile datasets from CEFR-J ☆147 · Updated 5 years ago
- Data files for the Dictionary of Frequently-Used Taiwan Minnan (臺灣閩南語常用詞辭典) ☆80 · Updated 2 years ago
- Analyzes the given text and determines its vocabulary level based on CEFR levels ☆47 · Updated 2 years ago
- JavaScript Lemmatizer is a lemmatization library to retrieve a base form from an English inflected word. ☆66 · Updated 3 years ago
- Node.js Interface for CC-CEDICT (http://cc-cedict.org/) ☆27 · Updated 8 years ago
- A JavaScript Chinese word segmentation tool based on Python Jieba ☆49 · Updated 11 years ago
- Draw animated Japanese characters (Kanji and Kana), Korean characters (Hanja) and Chinese characters (Hanzi) in correct stroke order usin… ☆347 · Updated 8 months ago
- FastText for Node.js ☆197 · Updated 2 years ago
- Chinese and Chinese-English dictionaries (中文词典 / 中文詞典). ☆190 · Updated last year
- Free, open-source Chinese handwriting recognition in JavaScript ☆163 · Updated 6 years ago
- Chinese language vocabulary graph generation. Python/Flask tool that performs dictionary search and analysis on Chinese Hanzi characters.… ☆146 · Updated 2 years ago
- IDS data for CJK Unified Ideographs ☆458 · Updated 2 years ago
- Cantonese Romanization Converter ☆17 · Updated 4 years ago
- Chrome extension that translates Chinese words when hovering on them. ☆40 · Updated 2 years ago
- Dictionary-related data for the Kanji Database (漢字データベース) ☆101 · Updated 2 years ago
- Text to IPA converter in JavaScript ☆58 · Updated 3 years ago
- Implement the supermemo 2 algorithm. ☆81 · Updated 3 years ago
- Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code ☆90 · Updated 2 years ago
- Python library for CJK (Chinese, Japanese, and Korean) language dictionary ☆92 · Updated last week
- Upstream word list repository for rime-cantonese ☆30 · Updated last year
- A curated list of resources dedicated to Natural Language Processing (NLP) of Cantonese | 粵語 NLP ☆93 · Updated 3 years ago
- This oak tree belongs to the squirrel (這棵橡木是松鼠的) ☆25 · Updated 9 years ago