StuPeter / Sougou_dict_spider
Sogou lexicon crawler: downloads all categories, classifies automatically, and converts scel to txt
☆205 Updated 9 months ago
Alternatives and similar repositories for Sougou_dict_spider:
Users who are interested in Sougou_dict_spider are comparing it to the repositories listed below:
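- Preprocessed Chinese corpus ☆107 Updated 6 years ago
- THUOCL (THU Open Chinese Lexicon), a Chinese lexicon ☆895 Updated last year
- Python crawler that downloads lexicon files for the Sogou, Baidu, and QQ input methods; can be used to build vocabularies for different industries ☆112 Updated 7 years ago
- [Star] Converts scanned PDFs to docx ☆48 Updated 5 years ago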
- This is a corpus of Chinese abbreviations, including negative full forms. ☆191 Updated 3 years ago
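- Early Modern Chinese corpus dataset: natural language processing, corpus, Ancient Chinese, Old Chinese, Classical Chinese, digital humanities, computational linguistics ☆152 Updated 2 months ago
- A dictionary for Google Pinyin Input, generated from the cell lexicons of Sogou Pinyin Input. ☆61 Updated 8 years ago
- Lexicon for Chinese lexical analysis and word segmentation ☆118 Updated 3 years ago
- Simplified Chinese lexicon with word frequencies and phonetic annotations; the special-symbol lexicon covers Greek letters, some mathematical symbols, emoji, ordinal markers, and more. ☆76 Updated 2 years ago
- Development of an automatic Chinese character decomposition system ☆102 Updated last year
- ChineseSemanticKB, a Chinese semantic knowledge base: 12 categories of commonly used semantic dictionaries for Chinese processing at the million-entry scale, including 340,000 abstract-semantics entries, 340,000 antonym entries, and 430,000 synonym entries; supports applications such as sentence expansion, rewriting, and event abstraction and generalization. ☆749 Updated last year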
- NLU is hard!!! ☆270 Updated 5 years ago
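- Full text of Xiandai Hanyu Cidian (Contemporary Chinese Dictionary), 7th edition, as TXT ☆258 Updated 7 months ago
- Chinese character to Wubi code conversion tool ☆33 Updated 6 years ago
- mdict/mdx resources for Xiandai Hanyu Cidian (Contemporary Chinese Dictionary), 7th edition. ☆183 Updated 2 years ago
- Classical Chinese dictionary, scraped from the Classical Chinese dictionary website (文言文字典网) to build a Kindle dictionary. ☆65 Updated 6 years ago
- Repository of classical Chinese texts ☆265 Updated 7 years ago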
- DomainWordsDict, a Chinese word dictionary covering more than 68 domains, which can be used for text classification and knowledge enhancement tasks.… ☆678 Updated 3 years ago
- Chinese dictionaries and corpora. ☆169 Updated 10 years ago
- Classical Chinese poetry corpus ☆125 Updated 7 years ago
- Wubi encoding database for a very large character set ☆87 Updated 2 years ago
- An exploration for Eventline (important news ranked and organized by public time): for a collection of news reports on a given event topic, uses the docrank algorithm to identify the importance of each report and, based on report time, selects the timeline's important… ☆215 Updated 6 years ago
- A collection of Chinese NLP corpora including basic Chinese syntactic wordset, semantic wordset, historic corpus and evaluation corpus. Chinese natural… ☆440 Updated 6 years ago
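- Commonly used Chinese stop-word lists ☆73 Updated 6 years ago
- Converts Sogou Pinyin lexicons to txt files ☆51 Updated 7 years ago
- Performance benchmarks of major Chinese word segmenters ☆155 Updated 6 years ago
- Toutiao Chinese news text (multi-level) classification dataset ☆392 Updated 3 years ago
- Law handbook of the People's Republic of China: an Android reader ☆94 Updated last year
- Chinese, word segmentation, word lists, core dictionaries, event word lists, stop words, sensitive words, Q&A, Q&A data, knowledge graphs, text corpora. ☆152 Updated 3 years ago
- Tools for Corpus of People's Daily ☆274 Updated last year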