yishn / chinese-tokenizer
Tokenizes Chinese texts into words.
☆96 Updated 2 years ago
Alternatives and similar repositories for chinese-tokenizer:
Users who are interested in chinese-tokenizer are comparing it to the libraries listed below.
- A tool to find grammar patterns in Chinese text ☆26 Updated 5 years ago
- Converts from Chinese characters to pinyin, between simplified and traditional, and does word segmentation. ☆109 Updated last year
- Split {Japanese, English} text into sentences. ☆123 Updated last year
- Python module that identifies Chinese text as being Simplified or Traditional ☆90 Updated 3 months ago
- Data files for the Ministry of Education Revised Mandarin Chinese Dictionary (教育部重編國語辭典); please report suggestions or bugs at moedict-process ☆139 Updated 2 years ago
- Han character library for CJKV languages ☆155 Updated 4 years ago
- CLDR text segmentation for JavaScript ☆38 Updated 10 months ago
- Data files for the Dictionary of Frequently-Used Taiwan Minnan (臺灣閩南語常用詞辭典) ☆77 Updated last year
- HanziJS is a Chinese character and NLP module for Chinese language processing for Node.js ☆379 Updated 5 months ago
- English Lemma Database - Compiled by Referencing British National Corpus ☆29 Updated 5 months ago
- Practice Chinese language grammar ☆16 Updated 3 years ago
- Chinese and Chinese-English dictionaries (中文词典 / 中文詞典). ☆162 Updated 10 months ago
- Offline bilingual dictionaries made using data from Wiktionary ☆52 Updated 9 years ago
- Open Chinese Character Dictionary (開放漢語字典): a database of modern Chinese character pronunciations ☆22 Updated 4 years ago
- Chrome extension that translates Chinese words when hovering on them. ☆37 Updated 2 years ago
- Free, open-source Chinese handwriting recognition in JavaScript ☆148 Updated 5 years ago
- An experimental webpage for observing Chinese natural language processing. It demonstrates the processes of decomposition, transformation… ☆64 Updated 9 months ago
- 中華大辭典 (a comprehensive Chinese dictionary) ☆117 Updated last year
- Chinese (zh-cnm) open-data audio files for 8,596 HSK words and 1,707 syllables. ☆45 Updated 3 years ago
- Cantonese Romanization Converter ☆16 Updated 3 years ago
- The JavaScript version of Open Chinese Convert (OpenCC) ☆265 Updated 2 years ago
- Query System of Chinese Proficiency Grading Standards for International Chinese Language Education (《国际中文教育中文水平等级标准》), New HSK Levels … ☆28 Updated 11 months ago
- Convert *.LD2 dictionary files into human-readable text files ☆66 Updated 12 years ago
- Enter only simplified characters and generate word entries with traditional characters, pinyin, meaning, audio, and example sentences ☆30 Updated 3 years ago
- 零時字引: look up characters by component and stroke count ☆24 Updated 8 years ago
- Hanzipy is a Chinese character and NLP module for Chinese language processing for Python. It is primarily written to help provide a frame… ☆19 Updated last year
- Dictionary-related data from the Kanji Database (漢字データベース) ☆91 Updated 2 years ago
- JavaScript Lemmatizer is a lemmatization library to retrieve a base form from an English inflected word. ☆66 Updated 3 years ago
- Stroke order SVG files for Chinese Hanzi characters ☆39 Updated last year
- Node.js Interface for CC-CEDICT (http://cc-cedict.org/) ☆26 Updated 8 years ago