mmlzhang / cnki_patent
CNKI (China National Knowledge Infrastructure) patent crawler
☆17 Updated last year
Related projects
Alternatives and complementary repositories for cnki_patent
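- [Data + code] Word segmentation of listed-company annual report text, keyword frequency statistics, and a digital-transformation keyword list ☆26 Updated 2 years ago
- Homemade toy Python crawler for scraping data such as dishonest judgment debtors (失信被执行人) and patents ☆18 Updated 4 years ago
- Patent crawler based on the request module; saves output in CSV format ☆12 Updated 7 years ago
- Text classification is the process of automatically determining a text's category from its content under a given classification scheme. First, a scrapy crawler uses the regular pattern of CNKI URLs to crawl more than 700,000 invention patents published in 2014; data cleaning then filters these down to more than 600,000 labeled records. TF-IDF is applied to these 600,000-plus texts to extract term frequencies, and terms are extracted in order of frequency… ☆104 Updated 6 years ago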
- Code Repository for MS20190155 ☆141 Updated 7 months ago
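- Crawler for scraping patent information ☆27 Updated 8 years ago
- Quickly build domain-specific sentiment lexicons (mobile phones, cars, etc.) using the SO_PMI mutual information algorithm and a word-vector method ☆89 Updated 3 years ago
- CNKI crawler cnkispider: enter keywords to crawl CNKI search results ☆32 Updated 6 years ago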
- Public Behavior Analysis under the COVID-19 Emergency: Based on Weibo Mining ☆10 Updated 3 years ago
- https://github.com/jcgcarranza/respol_patents_code ☆29 Updated 4 years ago
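- Crawls Google Patents ☆8 Updated 5 years ago
- Court judgment document data, incrementally updated ☆37 Updated 4 years ago
- Build a co-occurrence matrix in Python and store it as triples in a CSV file. ☆51 Updated 5 years ago
- This project crawls the government work reports of provinces and cities and attempts to tell them apart through clustering, topic classification, and similar methods. ☆11 Updated 5 years ago
- Sentiment analysis using a Chinese sentiment vocabulary ontology, followed by topic analysis of the texts for each emotion. Using Chinese Sentiment Dictionary for Sentiment Analysis, Then applying LDA Topic Analysis for each E… ☆13 Updated 3 years ago
- Patent information and full-text download ☆18 Updated last year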
- This repository provides some useful data files and Stata do-files for readers. ☆35 Updated 5 years ago
- A tool for analyzing openlaw court judgment documents, built for data journalism needs. ☆39 Updated last year
- [WeChat official account: 大邓和他的python] Python syntax quick start: https://www.bilibili.com/video/av44384851 Python web crawler quick start: https://www.bilibili.com/video/av72010301… ☆97 Updated 2 years ago
- A crawler for Web of Science data, with a particular focus on analyzing citation data ☆13 Updated 5 years ago
- ☆26 Updated last year
- Scrapes Baidu Index, demand graphs, and audience profiles ☆21 Updated 2 years ago
- LDA model for Chinese text mining, using the gensim and jieba libraries ☆16 Updated 5 years ago
- CNKI crawler ☆144 Updated 7 years ago
- NJU Master Course **Big Data Mining and Analysis** ☆127 Updated 2 years ago
- Simple annual report analysis tool ☆35 Updated 7 years ago
- This repository provides the replication code and data for Kogan, L., Papanikolaou, D., Seru, A. and Stoffman, N., QJE 2017. ☆30 Updated 3 years ago
- ☆11 Updated last year