tsroten / hanzidentifier
Python module that identifies Chinese text as being Simplified or Traditional
☆101 Updated 10 months ago
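As a rough illustration of what identifying text as Simplified or Traditional involves, the following is a minimal sketch and not hanzidentifier's own code or API. The function name `identify_script`, the return labels, and the two five-character sets are all hypothetical and chosen only for illustration; a real identifier would rely on complete character tables.

```python
# Illustrative sketch only. These tiny sets are assumptions for the example;
# a real identifier needs complete Simplified/Traditional character tables.
SIMPLIFIED_ONLY = set("这国说书们")    # forms used only in Simplified text
TRADITIONAL_ONLY = set("這國說書們")   # their Traditional counterparts


def identify_script(text: str) -> str:
    """Classify text by which script-specific characters it contains."""
    chars = set(text)
    has_simplified = bool(chars & SIMPLIFIED_ONLY)
    has_traditional = bool(chars & TRADITIONAL_ONLY)
    if has_simplified and has_traditional:
        return "mixed"
    if has_simplified:
        return "simplified"
    if has_traditional:
        return "traditional"
    # No script-specific character was seen: the text is either not Chinese
    # or uses only characters shared by both scripts.
    return "undetermined"


if __name__ == "__main__":
    for sample in ("这是书", "這是書", "这是書", "你好"):
        print(sample, identify_script(sample))
```

Set intersection keeps the check independent of character order and frequency. The trade-off is that text made up only of characters common to both scripts cannot be told apart, which is why the sketch reports it separately.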
Alternatives and similar repositories for hanzidentifier
Users who are interested in hanzidentifier are comparing it to the libraries listed below.
- Python library for CJK (Chinese, Japanese, and Korean) language dictionary ☆93 Updated last week
- Hanzi Converter for Traditional and Simplified Chinese ☆190 Updated 5 years ago
- Library for extracting text and timestamps from multiple subtitle files (.ass, .ssa, .srt, .sub, .txt). ☆53 Updated last year
- A CWN Python binding with graph structure ☆34 Updated 2 years ago
- A sentence segmentation library with wide language support optimized for speed and utility. ☆67 Updated 3 months ago
- Cython wrapper on Hunspell Dictionary ☆66 Updated last year
- ☆174 Updated 6 months ago
- Data files for the 臺灣閩南語常用詞辭典 (Dictionary of Frequently-Used Taiwan Minnan) ☆80 Updated 2 years ago
- Constants used in Chinese text processing ☆377 Updated 9 months ago
- Han character library for CJKV languages ☆162 Updated 4 years ago
- Packager for the 《香港二十世紀中期粵語語料庫》 (A Corpus of Mid-20th Century Hong Kong Cantonese) ☆16 Updated 9 years ago
- A Python module for English lemmatization and inflection. ☆270 Updated 2 years ago
- A Python library to parse MediaWiki WikiText ☆313 Updated 4 months ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆154 Updated 2 years ago
- Simple conversion and localization between simplified and traditional Chinese using tables from MediaWiki. ☆552 Updated last year
- Machine-Translation-based sentence alignment tool for parallel text ☆312 Updated 4 years ago
- OpenCC made with Python ☆562 Updated last year
- A modern, interlingual wordnet interface for Python ☆260 Updated 3 weeks ago
- A toolbox for working with the Chinese language in Python ☆150 Updated 5 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆254 Updated 2 years ago
- Phraseg - 一言: new word discovery toolkit ☆26 Updated 3 years ago
- Tokenizer, POS-tagger, and dependency parser with BERT/RoBERTa/DeBERTa/GPT models for Japanese and other languages ☆52 Updated last month
- A Python module for word inflections designed for use with spaCy. ☆93 Updated 5 years ago
- A Python package for learning, evaluating, annotating, and extracting vector representations of construction grammars ☆39 Updated 11 months ago
- Linguistic tree drawing to SVG in Python, aimed at Jupyter ☆65 Updated last year
- A Python library for working with and comparing language codes. ☆350 Updated 4 months ago
- Multilingual sentence alignment using sentence embeddings ☆124 Updated 11 months ago
- Export UNIHAN's database to csv, json or yaml ☆59 Updated last week
- unihandecode is a transliteration library to convert all characters/words in Unicode into ASCII alphabet that aware with Language prefere… ☆68 Updated 3 years ago
- ☆44 Updated last year