berniey / hanziconv
Hanzi Converter for Traditional and Simplified Chinese
☆188 · Updated 5 years ago
Alternatives and similar repositories for hanziconv
Users who are interested in hanziconv are comparing it to the libraries listed below.
- Constants used in Chinese text processing ☆372 · Updated 6 months ago
- Simple conversion and localization between simplified and traditional Chinese using tables from MediaWiki. ☆541 · Updated last year
- OpenCC made with Python ☆556 · Updated last year
- Python module that identifies Chinese text as being Simplified or Traditional ☆95 · Updated 7 months ago
- Chinese character decomposition library that breaks characters down into radicals, for use as glyph features in machine learning | Hanzi Decomposition Library allows Chinese characters to be broken down into radicals and components… ☆381 · Updated 8 months ago
- Chinese stopwords collection ☆137 · Updated 5 years ago
- Estimate the phonetic distance between Chinese words and get similar sounding candidate words. ☆37 · Updated last month
- Pinyin data for words and phrases ☆487 · Updated 2 months ago
- ☆93 · Updated this week
- ☆125 · Updated 4 years ago
- Chinese character decomposition dictionary ☆781 · Updated 2 years ago
- A tool for ancient Chinese segmentation. ☆53 · Updated 6 years ago
- A corpus of Chinese abbreviations, including negative full forms. ☆196 · Updated 3 years ago
- Chinese-related dictionaries and corpora. ☆174 · Updated 10 years ago
- [This project is no longer maintained] Converts Chinese characters to pinyin, with support for polyphonic characters, 拼音 -> pin yin ☆211 · Updated last month
- A Chinese sentiment dataset that may be useful for sentiment analysis. ☆232 · Updated 8 years ago
- ☆39 · Updated last year
- Converting Chinese number string <=> int/float/str ☆19 · Updated last month
- Cantonese Linguistics and NLP ☆383 · Updated last year
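- Compares the pronunciation and shape of 6,700 commonly used Chinese characters and outputs lists of similar-sounding and similar-looking characters. # Similar characters ☆459 · Updated last year
- Cantonese word segmentation tool ☆48 · Updated 6 years ago
- Converts Sogou Pinyin dictionaries to txt files ☆50 · Updated 7 years ago
- Chinese tokenizer benchmark ☆24 · Updated 6 years ago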
- Utility scripts or libraries for various Natural Language Processing tasks. ☆39 · Updated 3 years ago
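- 📦 Quickly converts between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more) ☆729 · Updated 6 months ago
- Performance evaluation of the major Chinese word segmenters ☆157 · Updated 6 years ago
- People's Daily annotated Chinese corpus, January-April 1998 ☆32 · Updated 6 years ago
- A lightweight, full-featured Chinese tokenizer that helps students understand how tokenizers work. MicroTokenizer: A lightweight Chinese tokenizer designed for educational and research purposes. Provides a… ☆154 · Updated 8 months ago
- Classical Chinese corpus ☆285 · Updated 3 years ago
- Chinese character featurizer that extracts character features (pronunciation features, glyph features) for use as deep learning features | A Chinese character feature extractor, which extracts the features of Chinese charac… ☆295 · Updated 4 years ago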