yoghurtjia/Zhihu_bigdata

yoghurtjia / Zhihu_bigdata

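A data analysis of 3 million Zhihu users built with scrapy and pandas. scrapy first crawls the profiles of 3 million Zhihu users; pandas then filters the data to find the notable Zhihu users of interest and visualizes the results as charts.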

☆159
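
The description above outlines two stages: crawl user profiles, then filter them with pandas. As a rough illustration of the filtering stage only, here is a minimal sketch. The column names (`name`, `followers`, `answers`), the sample rows, the `top_users` helper, and the follower threshold are all hypothetical and are not taken from the repository.

```python
import pandas as pd

# Hypothetical sample of crawled user profiles; the column names are
# assumptions for illustration, not the repository's actual schema.
users = pd.DataFrame(
    {
        "name": ["alice", "bob", "carol", "dave"],
        "followers": [120_000, 350, 48_000, 9],
        "answers": [410, 12, 95, 0],
    }
)


def top_users(df: pd.DataFrame, min_followers: int = 10_000) -> pd.DataFrame:
    """Keep users at or above a follower threshold, most followed first."""
    selected = df[df["followers"] >= min_followers]
    return selected.sort_values("followers", ascending=False).reset_index(drop=True)


result = top_users(users)
print(result)
```

With matplotlib installed, a call such as `result.plot.bar(x="name", y="followers")` would be one way to chart the filtered users; the repository may well do this differently.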

Alternatives and similar repositories for Zhihu_bigdata

Users that are interested in Zhihu_bigdata are comparing it to the libraries listed below.

zhijunio / scrapy-zhihu-github
scrapy examples for crawling zhihu and github
☆221 · Jan 11, 2023 · Updated 3 years ago
ansenhuang / scrapy-zhihu-users
Crawls Zhihu user data with scrapy
☆153 · Apr 11, 2016 · Updated 10 years ago
egrcc / zhihu-python
Retrieves Zhihu content, including questions, answers, users, and collections
☆2,333 · Feb 8, 2022 · Updated 4 years ago
luckyJerryChen / GVBD
A visualization and analysis tool for large-scale social data
☆19 · Sep 18, 2016 · Updated 9 years ago
LiuXingMing / Tmall1212
Crawler for the Tmall Double 12 (December 12) sale, with product data included.
☆202 · Dec 12, 2016 · Updated 9 years ago
maxliaops / scrapy-itzhaopin
☆94 · Apr 28, 2014 · Updated 12 years ago
MiYogurt / NicoUI
📐 A pure-CSS UI framework, produced as the output of a tutorial that teaches you to write your own CSS framework.
☆12 · Jul 23, 2018 · Updated 8 years ago
julyclyde / david-mysql-tools
Automatically exported from code.google.com/p/david-mysql-tools
☆11 · Mar 1, 2016 · Updated 10 years ago
adasilva / intro_kivy
A project-based introduction to kivy.
☆18 · Oct 3, 2014 · Updated 11 years ago
PHP-Zebra / Zebra-MergeTable
PHP-based horizontal splitting of large tables, similar to MySQL merge tables
☆16 · Nov 29, 2014 · Updated 11 years ago
yoyzhou / weibo_scrapy
WEIBO_SCRAPY is a Multi-Threading SINA WEIBO data extraction Framework in Python.
☆155 · Jun 3, 2026 · Updated last month
hs-web / hsweb-code-generator
New version of the code generator
☆10 · Apr 19, 2018 · Updated 8 years ago
aboutstudy / Web-Security-Filter
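Filter functions for common web SQL injection and cross-site attacks, covering common security filtering such as SQL injection, cross-site scripting (XSS), and cross-site POST submission.
☆16 · Oct 30, 2012 · Updated 13 years ago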
wosiwo / ServerMonitor-server
Built on PHP's swoole extension; monitors server information and provides a socket access interface.
☆17 · Oct 29, 2014 · Updated 11 years ago
hating / ZhihuTrend
Zhihu big-data analysis and trending-topic generation.
☆331 · Mar 21, 2017 · Updated 9 years ago
yoghurtjia / -python-BAT-
Lists approaches to common big-data interview and written-test questions at BAT companies (Baidu, Alibaba, Tencent) and implements them in Python
☆193 · Sep 30, 2017 · Updated 8 years ago
liyuefeilong / Grab-Ticket
A Python-based ticket-grabbing tool for 12306
☆11 · Apr 11, 2016 · Updated 10 years ago
julien-duponchelle / scrapy-elasticsearch
A scrapy pipeline which sends items to an Elasticsearch server
☆97 · Jan 2, 2018 · Updated 8 years ago
MatrixSeven / ZhihuSpider
Zhihu crawler that can also crawl follow relationships
☆306 · Jun 7, 2025 · Updated last year
da2vin / Spider_index
Crawls the Baidu Index and the Alibaba Index using selenium and stores the data in HBase, with automatic CAPTCHA recognition and multithreading control
☆32 · Dec 11, 2016 · Updated 9 years ago
unm-art / WMS-Labeling
Custom application to add label printing capabilities to OCLC's WMS system.
☆14 · Jul 20, 2026 · Updated last week
DNSPod / dnspod-monitor-callback-php
PHP server-side example of the URL callback for DNSPod downtime monitoring
☆19 · Jan 22, 2019 · Updated 7 years ago
MorganZhang100 / zhihu-spider
A web spider for zhihu.com
☆720 · Jan 17, 2024 · Updated 2 years ago
Tachone / zhihu_spider
large-scale user information crawler of zhihu
☆77 · May 10, 2017 · Updated 9 years ago
cw1997 / Tieba-Posting-Frequency
Statistics on Baidu Tieba posting frequency and on trending keywords in Tieba posts
☆34 · Jul 6, 2017 · Updated 9 years ago
wuchong / scrapy-dynamic-configurable
A dynamically configurable news crawler based on Scrapy
☆164 · Jul 24, 2017 · Updated 9 years ago
djyde / cown
Qiniu bucket manager and uploader
☆12 · May 9, 2016 · Updated 10 years ago
lingfan / gop
Game Operating Platform
☆26 · May 16, 2016 · Updated 10 years ago
li914 / workerman.chat.04.com
Real-time chat over HTML5 WebSocket, based on the thinkphp5.1 and workerman frameworks
☆14 · Jan 8, 2019 · Updated 7 years ago
wanghan0501 / UserSessionBehaviorOfflineAnalysis
Offline analysis project for user session behavior data (Sichuan University, 拓思爱诺)
☆68 · Jul 1, 2022 · Updated 4 years ago
sjhfx / rwda
Weibo data analysis in R
☆13 · Aug 5, 2019 · Updated 6 years ago
wwj718 / ibot
Adds a natural-language interface to a command-line train ticket query tool
☆60 · Aug 3, 2016 · Updated 9 years ago
shencan / Ansible-Book-Code
☆14 · Nov 14, 2016 · Updated 9 years ago
goozp / ths-spider-example
A complete scrapy crawler example that crawls stock and news data
☆17 · Aug 15, 2020 · Updated 5 years ago
7sDream / zhihu-py3
[No longer maintained] The successor, zhihu-oauth https://github.com/7sDream/zhihu-oauth, has been taken down under the DMCA and is also no longer developed; only a code archive is provided:
☆1,034 · Sep 17, 2016 · Updated 9 years ago
hailong0707-zz / spider_news_all
Scrapy spiders for various news websites
☆109 · Sep 3, 2015 · Updated 10 years ago
oeljeklaus-you / LogAnalyzeHelper
Data-cleaning program for a forum log analysis system (includes an IP rule library, UDF development, MapReduce programs, and log data)
☆32 · May 18, 2018 · Updated 8 years ago
wxzher / Python-Crawl-and-Analysis-for-Bilibili-Videos
This project uses Python to mine and visualize data on Bilibili videos about the gaokao (the college entrance exam)
☆14 · Nov 15, 2022 · Updated 3 years ago
lustlost / ubackup
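This system provides off-site disaster recovery for Youzu's (游族) 20,000+ database instances, roughly 400,000+ backup files per day, and 40TB+ of data (including mysql, redis, ssdb)
☆159 · Aug 15, 2016 · Updated 9 years ago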