sunpinyin / open-gram
an open solution for collecting n-gram Chinese lexicon and n-gram statistics
☆73Updated 9 years ago
Alternatives and similar repositories for open-gram:
Users who are interested in open-gram are comparing it to the libraries listed below
- Yet another Chinese word segmentation package based on character-based tagging heuristics and CRF algorithm☆245Updated 12 years ago
- ZPar statistical parser. Universal language support (depending on the availability of training data), with language-specific features for…☆135Updated 8 years ago
- ☆92Updated 4 months ago
- Software for unsupervised word segmentation and language model learning using lattices☆45Updated 8 years ago
- Chinese word segmentation module of LTP☆46Updated 9 years ago
- Clone of "A Good Part-of-Speech Tagger in about 200 Lines of Python" by Matthew Honnibal☆48Updated 8 years ago
- OpenCC binding for Python.☆52Updated 4 years ago
- Spelling Corrector for Input Method Engine (IME)☆31Updated 9 years ago
- An Efficient Lexical Analyzer for Chinese☆42Updated 5 years ago
- sequence labeling by neural network☆17Updated 7 years ago
- Utility scripts or libraries for various Natural Language Processing tasks.☆39Updated 3 years ago
- Chinese Words Segment Library based on HMM model☆167Updated 10 years ago
- Benchmark of Chinese word segmentation software (Chinese tokenizer benchmark)☆23Updated 6 years ago
- Recurrent Neural Networks(GRU) for character-level language models on Chinese, in Python/Theano☆63Updated 7 years ago
- Chinese NLP corpus datasets☆20Updated 6 years ago
- Automatically generate Chinese words from huge text.☆91Updated 10 years ago
- Chinese natural language processing toolkit☆86Updated 9 years ago
- An open-access corpus of conversational bilingual speech in Cantonese and English☆40Updated 2 years ago
- EMNLP2015_code_Long Short-Term Memory Neural Networks for Chinese Word Segmentation☆77Updated 9 years ago
- A Python project for getting pinyin for Chinese words or sentences☆8Updated 6 years ago
- People's Daily annotated Chinese corpus, January-April 1998☆30Updated 6 years ago
- ☆129Updated 7 years ago
- ☆6Updated 7 years ago
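- ctbparser is an open-source Chinese processing toolkit implemented in C++ (GBK encoding) for word segmentation, part-of-speech tagging, and dependency parsing, following the Penn Chinese Treebank (CTB) standard.☆12Updated 10 years ago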
- An absolutely fun Chinese pronunciation engine (funny Chinese text-to-speech engine)☆51Updated 11 years ago
- A C++ toolkit for neural machine translation for CPU☆88Updated 5 years ago
- a text analyzing (match, rewrite, extract) engine (python edition)☆80Updated 7 years ago
- Chinese Synonym Library☆123Updated 6 years ago
- Constants used in Chinese text processing☆371Updated 3 months ago
- Chinese-related dictionaries and corpora.☆172Updated 10 years ago