hailinli / crawGovData
Crawls data from government websites (Ganzhou, Turpan, Dali, Taiyuan, Daqing)
☆32 Updated 6 years ago
Alternatives and similar repositories for crawGovData
Users that are interested in crawGovData are comparing it to the libraries listed below
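- Crawls summary-title pairs of news articles from portal websites and uses seq2seq to generate titles from the summaries ☆45 Updated 8 years ago
- Crawls game reviews from Baidu Tieba, TapTap, the App Store, and official Weibo accounts (based on redis_scrapy); the filter is implemented with a Bloom filter. ☆55 Updated 6 years ago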
- Self-implemented BaiduIndexSpyder based on Selenium, with index image decoding and number image conversion; automatically collects historical Baidu search index data by keyword ☆42 Updated 7 years ago
- Self-implemented WeiboIndexSpyder based on Selenium; collects the Sina Weibo Index (micro index), including the overall, mobile, and PC indexes ☆31 Updated 7 years ago
- ChineseAntiword: an antonym lookup interface for Chinese words, based on a dictionary crawled from online resources ☆59 Updated 6 years ago
- Intelligent customer service ☆105 Updated 6 years ago
- Crawler and format converter for Sogou Input Method word dictionaries ☆25 Updated 7 years ago
- Self-implemented text feature extraction using algorithms including CHI, DF, IG, and MI, for text classification experiments based on s… ☆49 Updated 7 years ago
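- Regex-based extraction and normalization of time keywords ☆21 Updated 3 years ago
- Text-pair relation comparison: semantic similarity, literal similarity, textual entailment, and more ☆55 Updated 5 years ago
- Text classification benchmark ☆25 Updated 7 years ago
- Building a character relationship graph for Jin Yong's novels ☆63 Updated 5 years ago
- Time extraction and normalization for spoken language ☆13 Updated 5 years ago
- Corporate event extraction ☆14 Updated 4 years ago
- Minimalist crawler workflow ☆41 Updated 2 years ago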
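- Qimen, named after the art of Qimen Dunjia, is a tool for extracting various kinds of entities. ☆29 Updated 5 years ago
- Baidu Baike crawler ☆72 Updated last year
- Fine-tunes pretrained language models (BERT, Roberta, XLBert, etc.) to compute the similarity between two texts (cast as a sentence-pair classification task); suitable for Chinese text ☆89 Updated 4 years ago
- Study notes ☆17 Updated 5 years ago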
- Quickly run many NLP tasks: a TensorFlow framework for classification, sequence labeling, matching, generation, and other NLP tasks (Chinese NLP, with distributed support) ☆30 Updated 4 years ago
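- Commonly used Chinese stopword lists ☆77 Updated 7 years ago
- lightsmile's personal mini general-purpose crawler framework for crawling publicly available corpus data from the web. ☆12 Updated 4 years ago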
- Self-implemented SpellCorrection: query correction based on pinyin similarity and edit distance. ☆83 Updated 3 years ago
- ☆58 Updated 3 years ago
- Self-implemented sentiment word expansion using seed sentiment words and SO-PMI; the method has been tested to be effective (based on seed sentiment words and SO-PMI…) ☆87 Updated 7 years ago
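- Fuzzy matching of company and enterprise names: word-frequency-based extraction of the core company name and edit-distance-based match scoring ☆41 Updated 4 years ago
- CCKS2019 event subject extraction for the financial domain ☆46 Updated 6 years ago
- Modified gensim-fast2vec for flexible use of large-scale external word vectors (with OOV lookup capability) ☆22 Updated 6 years ago
- Fast duplicate detection for massive Chinese text collections ☆17 Updated 6 years ago
- Service package for rasa_chinese ☆18 Updated 4 years ago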