yishn / chinese-tokenizer
Tokenizes Chinese texts into words.
☆100 · Updated 2 years ago
Alternatives and similar repositories for chinese-tokenizer
Users who are interested in chinese-tokenizer are comparing it to the libraries listed below.
- HanziJS is a Chinese character and NLP module for Chinese language processing for Node.js ☆397 · Updated last year
- Converts from Chinese characters to pinyin, between simplified and traditional, and does word segmentation. ☆116 · Updated 2 years ago
- A tool to find grammar patterns in Chinese text ☆28 · Updated 5 years ago
- CLDR text segmentation for JavaScript ☆38 · Updated last year
- Split {Japanese, English} text into sentences. ☆135 · Updated last year
- Data files for the Dictionary of Frequently-Used Taiwan Minnan (臺灣閩南語常用詞辭典) ☆78 · Updated 2 years ago
- A JavaScript Chinese word segmentation tool based on Python Jieba ☆51 · Updated 11 years ago
- English Lemma Database - Compiled by Referencing British National Corpus ☆33 · Updated last year
- Han character library for CJKV languages ☆164 · Updated 4 years ago
- Python module that identifies Chinese text as being Simplified or Traditional ☆104 · Updated last year
- JavaScript Lemmatizer is a lemmatization library to retrieve a base form from an English inflected word. ☆67 · Updated 4 years ago
- An experimental webpage for observing Chinese natural language processing. It demonstrates the processes of decomposition, transformation… ☆68 · Updated last year
- Analyzes the given text and determines its vocabulary level based on CEFR levels ☆48 · Updated 2 years ago
- Chinese / Chinese-English dictionaries (中文词典 / 中文詞典). ☆208 · Updated last year
- Draw animated Japanese characters (Kanji and Kana), Korean characters (Hanja) and Chinese characters (Hanzi) in correct stroke order usin… ☆357 · Updated last month
- A compiled list of corpora for Taiwanese (台語), Indigenous languages (族語), and Hakka (客語) ☆44 · Updated 5 years ago
- Free, open-source Chinese handwriting recognition in JavaScript ☆164 · Updated 6 years ago
- Monorepo for Kanji, Furigana, Japanese DB, and others ☆59 · Updated 2 years ago
- Python library for CJK (Chinese, Japanese, and Korean) language dictionary ☆94 · Updated last week
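- Data files for the Ministry of Education Revised Mandarin Chinese Dictionary (教育部重編國語辭典); please report suggestions or bugs in moedict-process ☆150 · Updated 2 years ago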
- All the words from Google Books, sorted by frequency ☆119 · Updated 2 years ago
- FastText for Node.js ☆198 · Updated 2 years ago
- Open Chinese Character Dictionary (開放漢語字典): a database of modern Chinese character pronunciations ☆24 · Updated 5 years ago
- OpenCC implementation for pure Node.js ☆64 · Updated 7 years ago
- Implements the SuperMemo 2 algorithm. ☆81 · Updated 3 years ago
- Chinese language vocabulary graph generation. Python/Flask tool that performs dictionary search and analysis on Chinese Hanzi characters.… ☆151 · Updated 2 years ago
- Open Language Profiles: English profile datasets from CEFR-J ☆154 · Updated 5 years ago
- 中華大辭典 (Zhonghua Da Cidian, a Chinese dictionary) ☆122 · Updated 2 years ago
- Spaced repetition for memorizing tons of things. ☆165 · Updated 10 years ago
- The JavaScript version of Open Chinese Convert (OpenCC) ☆308 · Updated 3 years ago