Synkied / hanzipy
Hanzipy is a Chinese character and NLP module for Chinese language processing in Python. It is written primarily to give Chinese language learners a framework for exploring the language.
☆27Updated 3 months ago
Alternatives and similar repositories for hanzipy
Users who are interested in hanzipy are comparing it to the libraries listed below.
- Unicode-only CJKV IDS data☆13Updated last year
- Multilingual sentence alignment using sentence embeddings☆130Updated last year
- 粵文語料篩選器 Cantonese text filter☆41Updated 7 months ago
- 中文词典 / 中文詞典。Chinese / Chinese-English dictionaries.☆206Updated last year
- Python library for CJK (Chinese, Japanese, and Korean) language dictionary☆94Updated last week
- Han character library for CJKV languages☆163Updated 4 years ago
- Tokenizer POS-Tagger and Dependency-parser with BERT/RoBERTa/DeBERTa/GPT models for Japanese and other languages☆52Updated 2 months ago
- A Python package for learning, evaluating, annotating, and extracting vector representations of construction grammars☆39Updated last year
- A frequency lexicon for Hong Kong Cantonese☆23Updated 5 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo…☆108Updated last week
- Chinese lexicon containing definitions, character origins, and statistics, built for Dong Chinese (https://www.dong-chinese.com)☆53Updated 5 years ago
- ☆32Updated 2 years ago
- Ideographic Description Sequence Checker Tools☆25Updated 8 years ago
- A modern, interlingual wordnet interface for Python☆272Updated this week
- Find Chinese sentences based on your known vocabulary and other rules☆64Updated last year
- Sentence aligner☆120Updated 4 years ago
- Open Language Profiles — English profile datasets from CEFR-J☆153Updated 5 years ago
- Gather modern English word frequencies from all enwiki articles.☆227Updated last year
- This packages up data for the Open Multilingual Wordnet☆56Updated 5 months ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies)☆31Updated 4 months ago
- OpusFilter - Parallel corpus processing toolkit☆112Updated last week
- Code for paper "Kanbun-LM: Reading and Translating Classical Chinese in Japanese Method by Language Models"☆18Updated 2 years ago
- ☆78Updated 3 months ago
- A list of vocabulary lists☆22Updated 5 years ago
- Spoken Cantonese from Hong Kong.☆30Updated last week
- Machine-Translation-based sentence alignment tool for parallel text☆313Updated 4 years ago
- ☆29Updated last week
- The World Atlas of Language Structures☆69Updated last year
- HSK 3.0 Vocabulary Lists (words and characters)☆90Updated 2 years ago
- Workshop on Language Technologies for Historical and Ancient Language… (https://circse.github.io/LT4HALA/)☆34Updated last year