yishn / chinese-tokenizer
Tokenizes Chinese texts into words.
☆98Updated 2 years ago
Alternatives and similar repositories for chinese-tokenizer
Users who are interested in chinese-tokenizer are comparing it to the libraries listed below.
- HanziJS is a Chinese character and NLP module for Chinese language processing for Node.js☆382Updated 8 months ago
- Converts from Chinese characters to pinyin, between simplified and traditional, and does word segmentation.☆110Updated last year
- A tool to find grammar patterns in Chinese text☆27Updated 5 years ago
- CLDR text segmentation for JavaScript☆38Updated last year
- Chinese (zh-cmn) open-data audio files for 8,596 HSK words and 1,707 syllables.☆45Updated 4 years ago
- Split {Japanese, English} text into sentences.☆129Updated last year
- Chrome extension that translates Chinese words when hovering on them.☆40Updated 2 years ago
- 中文词典 / 中文詞典。Chinese / Chinese-English dictionaries.☆173Updated last year
- Draw animated Japanese characters (Kanji and Kana), Korean characters (Hanja) and Chinese characters (Hanzi) in correct stroke order usin…☆327Updated 5 months ago
- Some of the stuff I am currently using for my Chinese studies☆13Updated 6 years ago
- Chinese lexicon containing definitions, character origins, and statistics, built for Dong Chinese (https://www.dong-chinese.com)☆46Updated 4 years ago
- Analyzes the given text and determines its vocabulary level based on CEFR levels☆46Updated 2 years ago
- English lemmatizer☆67Updated 2 years ago
- Stroke order SVG files for Chinese Hanzi characters☆41Updated last year
- English Lemma Database - Compiled by Referencing British National Corpus☆31Updated 9 months ago
- Python module that identifies Chinese text as being Simplified or Traditional☆95Updated 7 months ago
- Han character library for CJKV languages☆158Updated 4 years ago
- Open Chinese Dictionary (開放漢語字典) - a database of modern Chinese character pronunciations☆23Updated 4 years ago
- JavaScript Lemmatizer is a lemmatization library to retrieve a base form from an English inflected word.☆66Updated 3 years ago
- Node.js Interface for CC-CEDICT (http://cc-cedict.org/)☆27Updated 8 years ago
- The Chinese-German dictionary HanDeDict, which was available on the CHDW website until August 2015.☆22Updated this week
- Open Language Profiles — English profile datasets from CEFR-J☆130Updated 5 years ago
- Node module wrapper for WordNet dictionary.☆54Updated 3 years ago
- A JavaScript Chinese word segmentation tool based on Python Jieba☆47Updated 11 years ago
- Implements the SuperMemo 2 algorithm.☆81Updated 2 years ago
- Gather modern English word frequencies from all enwiki articles.☆216Updated last year
- Convert Chinese text to Pinyin or Jyutping☆28Updated last year
- Generate decks for Anki (spaced repetition software)☆166Updated 2 years ago
- Convert a Chinese sentence to Pinyin or Jyutping☆64Updated 2 years ago
- cc-kedict: Creative Commons Korean-English Dictionary☆41Updated 3 years ago