QiuMing/zhihuWebSpider


QiuMing / zhihuWebSpider

A Zhihu web spider written in Java, based on the WebMagic framework.

☆69

Alternatives and similar repositories for zhihuWebSpider

Users who are interested in zhihuWebSpider are comparing it to the libraries listed below.

jadetang / maliang
Maliang is a code generator for J2EE projects
☆13 · Jun 14, 2023 · Updated 3 years ago
webmagic-io / jobhunter
An example that uses WebMagic to scrape job postings and persist them to MySQL.
☆220 · Nov 22, 2016 · Updated 9 years ago
liyifeng1994 / webmagic-csdnblog
A small CSDN blog crawler written with WebMagic.
☆91 · Jun 7, 2018 · Updated 8 years ago
DSLZC / distributelock-spring-boot-starter
A Spring Boot starter for distributed locks.
☆11 · Oct 31, 2018 · Updated 7 years ago
YaboSu / zhihu_crawler
A Zhihu content crawler implemented in Python 3, depending on requests and BeautifulSoup4.
☆38 · May 9, 2016 · Updated 10 years ago
zhitongjob / atoms
A distributed cache system with multi-level caching support.
☆21 · Dec 27, 2017 · Updated 8 years ago
wycm / zhihu-crawler
zhihu-crawler is a high-performance, Java-based distributed crawler project that supports a free HTTP proxy pool and horizontal scaling.
☆919 · Apr 2, 2019 · Updated 7 years ago
eastseven / spring-cloud-demo
Spring Cloud microservices demo.
☆11 · Aug 1, 2024 · Updated last year
yuki-lau / weibo-spider
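A Sina Weibo crawler written in Java on top of HTTPClient 4.0. It stores crawled data in MySQL and supports concurrent multi-process execution. Features include crawling posts, comments, reposts, and following lists (by level). Continuously updated according to data needs...
☆354 · Feb 27, 2014 · Updated 12 years ago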
madpudding / RelationshipCrawler
Crawlers for CNKI, Wanfang, and the patent office.
☆11 · Mar 20, 2019 · Updated 7 years ago
scsfwgy / WebMagic_CSDN_Demo
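Scrapes various kinds of information from the CSDN website, implemented with jsoup/WebMagic. It handles a range of crawling strategies and is a good resource for learning about crawlers. [This project provides only the core crawler program and no other business logic.] [No longer maintained]
☆53 · Feb 15, 2018 · Updated 8 years ago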
chmod740 / BaiduBaikeSpider
Java source code for a multithreaded Baidu Baike crawler; data is stored in Oracle 11g.
☆13 · Feb 23, 2017 · Updated 9 years ago
Flowingsun007 / house_spider
Lianjia house spider: a crawler for Lianjia second-hand housing listings. Spring Boot + WebMagic + MySQL + Redis
☆27 · Apr 22, 2021 · Updated 5 years ago
CrawlScript / WeiboLoginTool
A Sina Weibo crawler based on WebCollector, plus related login tools such as Sina Weibo cookie retrieval.
☆14 · Nov 21, 2018 · Updated 7 years ago
seawaylee / spark-rec-v2
A big data monitoring platform for a Spark-based hybrid recommendation system.
☆11 · May 1, 2018 · Updated 8 years ago
dmengelt / spring-boot-with-aspectj-and-lombok
A simple Spring Boot project to show that AspectJ + Lombok does work with mvn but not within IntelliJ:
☆10 · Jan 8, 2019 · Updated 7 years ago
super-l / supurl
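A next-generation keyword-based URL collection system written in Go. It can get past search engines' anti-crawler mechanisms. Given user-entered keywords, it automatically runs batch collection across several mainstream search engines and processes the results uniformly. It supports precise collection and large-scale deep collection (automatically collecting related terms), and can easily collect tens of millions of unique domains per day.
☆11 · Jun 7, 2022 · Updated 4 years ago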
seawaylee / doubanWebSpider
A Douban crawler that scrapes popular tags, book information, and book reviews. Architecture: WebMagic + SSM + Redis + MySQL + ActiveMQ + Druid
☆44 · Apr 24, 2019 · Updated 7 years ago
k9982874 / stack
A distributed file storage service
☆11 · Nov 22, 2016 · Updated 9 years ago
egoist / unlinkable
Prevent Twitter from auto-converting URLs, @mentions, and #hashtags into links.
☆20 · Dec 8, 2022 · Updated 3 years ago
hncdyj123 / mybatis-generator
A generator tool more powerful than mybatis-generator; with only minor changes to the front-end JS code, the output is essentially a complete project.
☆17 · Dec 16, 2022 · Updated 3 years ago
cn27001 / lua-resty-waf
An open-source WAF (web application firewall).
☆10 · Nov 30, 2021 · Updated 4 years ago
xiaoyang611 / crawler-denfender
An anti-web-crawler system.
☆39 · Mar 10, 2015 · Updated 11 years ago
realxujiang / paper
Computer Foundations Practices
☆20 · Apr 1, 2021 · Updated 5 years ago
djunny / classifier4php
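A classifier based on PHP and word2vec for automatically categorizing articles, news, and similar content. The project includes sample training and recognition code, and uses PhpAnalysis for word segmentation. Simple and flexible. Contributions to optimize and improve it are welcome.
☆12 · Nov 22, 2019 · Updated 6 years ago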
feizhiwu / play
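Algorithms for promotional activities commonly used in back-end development: prize wheel, card flip, scratch card, red envelope grab, shuffling, and so on ...
☆12 · Dec 27, 2019 · Updated 6 years ago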
DaiDongLiang / DSC
Distributed SDN Controller
☆10 · Mar 16, 2016 · Updated 10 years ago
letcheng / ProxyPool
An automatic proxy pool component for dealing with anti-crawler measures.
☆79 · Mar 4, 2017 · Updated 9 years ago
data-integrations / salesforce
Salesforce plugins
☆12 · Updated this week
chideat / pcc
Back garden learning project.
☆10 · Mar 23, 2017 · Updated 9 years ago
kevinhqf / DesignPattern
Java Design Pattern Demo
☆30 · Aug 10, 2016 · Updated 9 years ago
pengcgithub / platform-springboot
A Spring Boot project built around CMS business requirements.
☆13 · Sep 27, 2018 · Updated 7 years ago
JFanZhao / spider
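A multithreaded, distributed crawler built with Java + httpclient + httpcleaner that scrapes product information from e-commerce sites. Data is stored in HBase, products are indexed with Solr, a Redis queue holds a shared URL repository, and ZooKeeper monitors the lifecycle of crawler nodes, among other things.
☆236 · Nov 6, 2020 · Updated 5 years ago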
zhvqee / micro-service-integration
A Spring Cloud test project.
☆11 · Jan 29, 2019 · Updated 7 years ago
jonas-werner / EdgeX_Foundry-Device_Creation
Python script for creating a new temperature and humidity sensor device from scratch
☆11 · Dec 3, 2019 · Updated 6 years ago
eson15 / MyJavaStorehouse
My small repository of Java knowledge.
☆15 · Aug 24, 2016 · Updated 9 years ago
aikuyun / Flink-Forward-Asia-2019
Slides (PPT) and video materials from Flink Forward Asia 2019.
☆37 · Dec 20, 2019 · Updated 6 years ago
ZhangJiupeng / Delta
A lightweight MVC framework that is easy to pick up, suited to rapid development of small Java web projects.
☆12 · Jun 13, 2016 · Updated 10 years ago
codingdie / Digger
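A statistical analysis and data mining project that combines distributed crawling, distributed storage, and distributed computation for statistical analysis.
☆14 · Feb 6, 2018 · Updated 8 years ago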