Synkied / hanzipy
Hanzipy is a Chinese character and NLP module for Chinese language processing in Python. It is written primarily to give Chinese language learners a framework for exploring Chinese.
☆25Updated 2 months ago
Alternatives and similar repositories for hanzipy
Users who are interested in hanzipy are comparing it to the libraries listed below.
- Tokenizer POS-Tagger and Dependency-parser with BERT/RoBERTa/DeBERTa/GPT models for Japanese and other languages☆52Updated last month
- Sentence aligner☆117Updated 4 years ago
- Multilingual sentence alignment using sentence embeddings☆127Updated 11 months ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo…☆107Updated 2 weeks ago
- Unicode-only CJKV IDS data☆12Updated last year
- A modern, interlingual wordnet interface for Python☆263Updated last month
- A Python package for learning, evaluating, annotating, and extracting vector representations of construction grammars☆39Updated 11 months ago
- Han character library for CJKV languages☆163Updated 4 years ago
- 中文词典 / 中文詞典。Chinese / Chinese-English dictionaries.☆195Updated last year
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation)☆49Updated 2 years ago
- ☆31Updated last year
- 🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python☆69Updated last week
- Improved Sentence Alignment in Linear Time and Space☆182Updated 2 years ago
- A list of vocabulary lists☆22Updated 5 years ago
- OpusFilter - Parallel corpus processing toolkit☆110Updated last week
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies)☆31Updated 3 months ago
- This packages up data for the Open Multilingual Wordnet☆55Updated 4 months ago
- Workshop on Language Technologies for Historical and Ancient Language… (https://circse.github.io/LT4HALA/)☆33Updated last year
- Tokenizer POS-tagger and Dependency-parser for Classical Chinese☆14Updated 3 months ago
- HSK 3.0 Vocabulary Lists (words and characters)☆88Updated last year
- [LREC 2020] EtymDB, an Etymological DataBase (v2.1)☆24Updated 3 years ago
- ☆76Updated last month
- The source of the phonetic transcriptions is Oxford Advanced Learner's Dictionary (3rd ed.), available from the Oxford Text Archive (http…☆24Updated 8 years ago
- ☆29Updated 2 weeks ago
- Python library for CJK (Chinese, Japanese, and Korean) language dictionary☆94Updated last week
- Spoken Cantonese from Hong Kong.☆30Updated last month
- Efficient Low-Memory Aligner☆146Updated 8 months ago
- Raw text of 申報☆26Updated 3 years ago
- Machine-Translation-based sentence alignment tool for parallel text☆312Updated 4 years ago
- Ideographic Description Sequence Checker Tools☆25Updated 8 years ago