yishn / chinese-tokenizer
Tokenizes Chinese texts into words.
☆100Updated 3 years ago
Alternatives and similar repositories for chinese-tokenizer
Users who are interested in chinese-tokenizer are comparing it to the libraries listed below.
- HanziJS is a Chinese character and NLP module for Chinese language processing for Node.js☆399Updated last year
- Converts from Chinese characters to pinyin, between simplified and traditional, and does word segmentation.☆117Updated 2 years ago
- A tool to find grammar patterns in Chinese text☆28Updated 6 years ago
- A JavaScript Chinese word segmentation tool based on Python Jieba☆51Updated 12 years ago
- Data files for the Dictionary of Frequently-Used Taiwan Minnan (臺灣閩南語常用詞辭典)☆79Updated 2 years ago
- FastText for Node.js☆199Updated 2 years ago
- Han character library for CJKV languages☆164Updated 4 years ago
- Python module that identifies Chinese text as being Simplified or Traditional☆105Updated last year
- Data files for the Ministry of Education Revised Mandarin Chinese Dictionary (教育部重編國語辭典); please report suggestions or bugs at moedict-process☆152Updated 2 years ago
- 中文词典 / 中文詞典。Chinese / Chinese-English dictionaries.☆219Updated last week
- Open Chinese character dictionary (開放漢語字典): a database of modern Chinese character pronunciations☆24Updated 5 years ago
- A compiled list of corpora for Taiwanese, Indigenous languages, and Hakka☆46Updated 5 years ago
- Cantonese Linguistics and NLP☆394Updated last year
- Cantonese Romanization Converter☆18Updated 4 years ago
- An experimental webpage for observing Chinese natural language processing. It demonstrates the processes of decomposition, transformation…☆69Updated last year
- Cantonese text filter (粵文語料篩選器)☆41Updated 9 months ago
- Draw animated Japanese characters (Kanji and Kana), Korean characters (Hanja) and Chinese characters (Hanzi) in correct stroke order usin…☆372Updated 2 months ago
- Chinese lexicon containing definitions, character origins, and statistics, built for Dong Chinese (https://www.dong-chinese.com)☆56Updated last month
- This oak tree belongs to the squirrel (這棵橡木是松鼠的)☆28Updated 9 years ago
- JavaScript Lemmatizer is a lemmatization library to retrieve a base form from an English inflected word.☆69Updated 4 years ago
- Analyzes the given text and determines its vocabulary level based on CEFR levels☆49Updated 2 years ago
- 中華大辭典 (Great Chinese Dictionary)☆122Updated 2 years ago
- Chinese (zh-cnm) open-data audio files for 8,596 HSK words and 1,707 syllables.☆59Updated 4 years ago
- Stroke order SVG files for Chinese Hanzi characters☆45Updated 2 years ago
- Chrome extension that translates Chinese words when hovering on them.☆40Updated 2 years ago
- Text to IPA converter in JavaScript☆58Updated 3 years ago
- fastText vectors created from Hong Kong data.☆22Updated 5 years ago
- Upstream word list repository for rime-cantonese☆30Updated last week
- Text corpus calculation in JavaScript. Supports Chinese and English.☆81Updated 4 years ago
- The JavaScript version of Open Chinese Convert (OpenCC)☆315Updated 3 years ago