yishn / chinese-tokenizer
Tokenizes Chinese texts into words.
☆99 Updated 2 years ago
Alternatives and similar repositories for chinese-tokenizer
Users who are interested in chinese-tokenizer are comparing it to the libraries listed below.
- Converts from Chinese characters to pinyin, between simplified and traditional, and does word segmentation. ☆112 Updated 2 years ago
- HanziJS is a Chinese character and NLP module for Chinese language processing for Node.js ☆386 Updated 10 months ago
- A tool to find grammar patterns in Chinese text ☆27 Updated 5 years ago
- Split {Japanese, English} text into sentences. ☆133 Updated last year
- JavaScript Lemmatizer is a lemmatization library to retrieve a base form from an English inflected word. ☆66 Updated 3 years ago
- Analyzes the given text and determines its vocabulary level based on CEFR levels ☆47 Updated 2 years ago
- FastText for Node.js ☆196 Updated 2 years ago
- 中文词典 / 中文詞典。Chinese / Chinese-English dictionaries. ☆183 Updated last year
- Implement the supermemo 2 algorithm. ☆81 Updated 3 years ago
- Open Language Profiles — English profile datasets from CEFR-J ☆145 Updated 5 years ago
- Sentence Boundary Detection in javascript for node. http://tessmore.github.io/sbd/ ☆215 Updated last year
- The 134,000+ words and their pronunciations in the CMU pronouncing dictionary ☆79 Updated 4 years ago
- Draw animated Japanese characters (Kanji and Kana), Korean characters (Hanja) and Chinese characters (Hanzi) in correct stroke order usin… ☆340 Updated 7 months ago
- Enter only simplified characters and create word meaning with Traditional, Pinyin, Meaning, Audio and example sentences ☆31 Updated 4 years ago
- Han character library for CJKV languages ☆160 Updated 4 years ago
- English Lemma Database - Compiled by Referencing British National Corpus ☆32 Updated 11 months ago
- Text to IPA converter in JavaScript ☆58 Updated 2 years ago
- Spaced repetition for memorizing tons of things. ☆164 Updated 10 years ago
- Export UNIHAN's database to csv, json or yaml ☆59 Updated this week
- Google TTS (Text-To-Speech) for node.js ☆286 Updated 2 years ago
- Generate decks for Anki (spaced repetition software) ☆167 Updated 2 years ago
- Node.js Interface for CC-CEDICT (http://cc-cedict.org/) ☆27 Updated 8 years ago
- Chinese lexicon containing definitions, character origins, and statistics, built for Dong Chinese (https://www.dong-chinese.com) ☆48 Updated 5 years ago
- Chinese language vocabulary graph generation. Python/Flask tool that performs dictionary search and analysis on Chinese Hanzi characters.… ☆143 Updated 2 years ago
- Fetch youtube user submitted or fallback to auto-generated captions ☆313 Updated last year
- Open Chinese Dictionary (開放漢語字典) - Modern Chinese character pronunciation database ☆24 Updated 4 years ago
- Python module that identifies Chinese text as being Simplified or Traditional ☆100 Updated 9 months ago
- Stroke order SVG files for Chinese Hanzi characters ☆42 Updated last year
- Chrome extension that translates Chinese words when hovering on them. ☆40 Updated 2 years ago
- cc-kedict: Creative Commons Korean-English Dictionary ☆41 Updated 4 years ago