znwang25 / fuzzychinese
A small package to fuzzy match Chinese words
☆83 Updated last year
Alternatives and similar repositories for fuzzychinese:
Users who are interested in fuzzychinese are comparing it to the libraries listed below:
- Commonly used Chinese stopword lists and a comparison of them ☆66 Updated 5 years ago
- Chinese subjectivity detection based on a subjectivity knowledge base: a method for rating sentence subjectivity using a Chinese subjectivity knowledge base. ☆57 Updated last year
- Converts Sogou Pinyin lexicon files to txt files ☆51 Updated 7 years ago
- Performance evaluation of major Chinese word segmenters ☆155 Updated 6 years ago
- Baidu Baike crawler ☆71 Updated 8 months ago
- Estimate the phonetic distance between Chinese words and get similar sounding candidate words. ☆36 Updated last year
- Corpus creator for Chinese Wikipedia ☆41 Updated 3 years ago
- Chinese Sentiment Analysis: sentiment analysis of Chinese text ☆184 Updated last year
- ChineseAntiword: an antonym lookup interface for Chinese words, based on a dictionary crawled from online resources ☆59 Updated 6 years ago
- CLUE Emotion Analysis Dataset, a fine-grained sentiment analysis dataset ☆8 Updated 5 years ago
- Tutorial for Chinese sentiment analysis with hotel review data ☆46 Updated 7 years ago
- Benchmark of Chinese word segmentation software (Chinese tokenizer benchmark) ☆23 Updated 6 years ago
- Company name parser: a segmentation tool for Chinese company names that extracts the place name, brand name (main word), industry term, and company suffix from a company name. ☆85 Updated 2 years ago
- MicroTokenizer: a lightweight yet full-featured Chinese tokenizer that helps students understand how tokenizers work. A lightweight Chinese tokenizer designed for educational and research purposes. Provides a… ☆146 Updated 4 months ago
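- An exploration of Eventline (important news ranked and organized by publication time): given a collection of news reports on an event topic, it uses the docrank algorithm to identify the importance of each report and uses the report times to pick out, along the timeline, the important… ☆215 Updated 6 years ago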
- Self-implemented sentiment word expansion using seed sentiment words and SO-PMI; the method has been tested and found effective. Based on seed sentiment words and SO-PMI… ☆86 Updated 6 years ago
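- A feature extraction tool for Chinese characters that extracts pronunciation (initial, final, tone), glyph shape (components, radical), four-corner code, and other features, which can also be fed to a model as tensors ☆130 Updated 4 years ago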
- Crawler and format converter for Sogou input method word dictionaries ☆25 Updated 6 years ago
- Pre-trained ELECTRA from Hong Kong data ☆27 Updated 4 years ago
- Self-implemented BaiduIndexSpyder based on Selenium, with index image decoding and number image conversion; automatically collects historical Baidu search index data by keyword ☆41 Updated 6 years ago
- Tools for processing the People's Daily corpus ☆274 Updated last year
- Hanzi Converter for Traditional and Simplified Chinese ☆183 Updated 4 years ago
- COS960: A Chinese Word Similarity Dataset of 960 Word Pairs ☆35 Updated 5 years ago
- NLP pre-/post-processing tools. ☆29 Updated 7 months ago
- 李傲龍's blog ☆81 Updated 7 months ago
- Neural Chinese Address Parsing ☆119 Updated 5 years ago
- Word segmentation of the People's Daily corpus using BiLSTM ☆56 Updated 5 years ago
- A corpus of Chinese abbreviations, including negative full forms. ☆191 Updated 3 years ago
- Chinese stopwords collection ☆132 Updated 4 years ago
- Self-implemented Pinyin2Chinese demo using a trie and a hidden Markov model (HMM): a simple demo of pinyin segmentation and pinyin-to-Chinese conversion. ☆86 Updated 6 years ago