tsroten / hanzidentifier
Python module that identifies Chinese text as being Simplified or Traditional
☆86 Updated this week
Related projects
Alternatives and complementary repositories for hanzidentifier
- Hanzi Converter for Traditional and Simplified Chinese ☆181 Updated 4 years ago
- Python library for CJK (Chinese, Japanese, and Korean) language dictionary ☆82 Updated this week
- Data files for the Dictionary of Frequently-Used Taiwan Minnan (臺灣閩南語常用詞辭典) ☆76 Updated last year
- Constants used in Chinese text processing ☆359 Updated last year
- Identification and conversion functions for Chinese text processing ☆58 Updated 5 months ago
- A list and compilation of corpora for Taiwanese, indigenous languages, and Hakka ☆39 Updated 4 years ago
- Converts between traditional and simplified Chinese ☆30 Updated 2 months ago
- An open solution for collecting n-gram Chinese lexicon and n-gram statistics ☆74 Updated 8 years ago
- Cantonese text filter (粵文語料篩選器) ☆33 Updated 2 months ago
- Han character library for CJKV languages ☆150 Updated 3 years ago
- Library for extracting text and timestamps from multiple subtitle files (.ass, .ssa, .srt, .sub, .txt). ☆55 Updated 8 months ago
- Spoken Cantonese from Hong Kong. ☆29 Updated last week
- A curated list of resources dedicated to Natural Language Processing (NLP) of Cantonese (粵語 NLP) ☆85 Updated 3 years ago
- An English-to-Cantonese machine translation model ☆49 Updated 7 months ago
- The first open Hakka word segmentation tool ☆13 Updated 6 years ago
- Estimate the phonetic distance between Chinese words and get similar sounding candidate words. ☆35 Updated last year
- OpenCC binding for Python. ☆52 Updated 4 years ago
- Input a Chinese character. Output all the variant characters of it. ☆19 Updated 2 months ago
- Phraseg (一言): a new-word discovery toolkit ☆26 Updated 2 years ago
- Hanyu Pinyin conversion table ☆35 Updated 3 years ago
- Export UNIHAN's database to csv, json or yaml ☆52 Updated this week
- ☆29 Updated 5 months ago
- Packager for the Corpus of Mid-20th Century Hong Kong Cantonese (香港二十世紀中期粵語語料庫) ☆16 Updated 8 years ago
- A tool for ancient Chinese segmentation. ☆53 Updated 5 years ago
- A CWN Python binding with graph structure ☆26 Updated last year
- Pre-trained ELECTRA from Hong Kong data ☆27 Updated 4 years ago
- 中華大辭典 (Zhonghua Da Cidian, a Chinese dictionary) ☆113 Updated last year
- Data files for the Ministry of Education Revised Mandarin Chinese Dictionary (教育部重編國語辭典); please report suggestions or bugs in moedict-process ☆134 Updated last year
- Cython wrapper on Hunspell Dictionary ☆65 Updated 4 months ago
- OpenCC made with Python ☆537 Updated 11 months ago