geekan/scrapy-examples

geekan / scrapy-examples

Multifarious Scrapy examples. Spiders for alexa / amazon / douban / douyu / github / linkedin etc.

☆3,253

Alternatives and similar repositories for scrapy-examples

Users who are interested in scrapy-examples are comparing it to the repositories listed below.

rmax / scrapy-redis
Redis-based components for Scrapy.
☆5,643 · May 19, 2026 · Updated 2 months ago
gnemoug / distribute_crawler
A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite: a MongoDB cluster provides the underlying storage, Redis handles distribution, and Graphite displays crawler status.
☆3,243 · Apr 18, 2017 · Updated 9 years ago
aivarsk / scrapy-proxies
Random proxy middleware for Scrapy
☆1,669 · Oct 1, 2019 · Updated 6 years ago
scrapinghub / portia
Visual scraping for Scrapy
☆9,508 · Jun 26, 2024 · Updated 2 years ago
scrapy / scrapyd
A service daemon to run Scrapy spiders
☆3,099 · Jun 19, 2026 · Updated last month
scrapy / scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
☆63,165 · Updated this week
zhijunio / scrapy-zhihu-github
Scrapy examples for crawling Zhihu and GitHub
☆221 · Jan 11, 2023 · Updated 3 years ago
scrapy / dirbot
Scrapy project to scrape public web directories (educational) [DEPRECATED]
☆1,627 · Oct 27, 2017 · Updated 8 years ago
binux / pyspider
A Powerful Spider (Web Crawler) System in Python.
☆16,802 · Apr 30, 2024 · Updated 2 years ago
scrapy-plugins / scrapy-splash
Scrapy+Splash for JavaScript integration
☆3,230 · Feb 11, 2025 · Updated last year
LiuXingMing / SinaSpider
Sina Weibo crawler (Scrapy, Redis)
☆3,284 · Sep 5, 2018 · Updated 7 years ago
istresearch / scrapy-cluster
This Scrapy project uses Redis and Kafka to create a distributed on-demand scraping cluster.
☆1,224 · Nov 7, 2023 · Updated 2 years ago
scrapinghub / scrapylib
Collection of Scrapy utilities (extensions, middlewares, pipelines, etc.)
☆33 · Feb 22, 2018 · Updated 8 years ago
yoyzhou / weibo_scrapy
WEIBO_SCRAPY is a multi-threaded Sina Weibo data extraction framework in Python.
☆155 · Jun 3, 2026 · Updated last month
marchtea / scrapy_doc_chs
Chinese translation of the Scrapy documentation
☆1,104 · Sep 12, 2019 · Updated 6 years ago
DormyMo / SpiderKeeper
Admin UI for Scrapy / open source Scrapinghub
☆2,769 · May 4, 2023 · Updated 3 years ago
my8100 / scrapydweb
Web app for Scrapyd cluster management, Scrapy log analysis & visualization, Auto packaging, Timer tasks, Monitor & Alert, and Mobile UI.…
☆3,410 · Feb 19, 2025 · Updated last year
Andrew-liu / scrapy_example
This repository stores some examples for learning Scrapy better
☆176 · Oct 9, 2020 · Updated 5 years ago
sebdah / scrapy-mongodb
MongoDB pipeline for Scrapy. This module supports both MongoDB in standalone setups and replica sets. scrapy-mongodb will insert the item…
☆358 · Apr 6, 2021 · Updated 5 years ago
wuchong / scrapy-dynamic-configurable
A dynamically configurable news crawler based on Scrapy
☆164 · Jul 24, 2017 · Updated 8 years ago
WSOL12 / Solana-Arbitrage-Bot
Solana Arbitrage Bot on pump.fun, Meteora, Raydium and Orca using Jito bundling, RPC and gRPC. …
☆510 · Mar 17, 2026 · Updated 4 months ago
yidao620c / core-scrapy
Python Scrapy demo
☆805 · Oct 1, 2020 · Updated 5 years ago
mjhea0 / Scrapy-Samples
Scrapy examples crawling Craigslist
☆199 · Apr 20, 2016 · Updated 10 years ago
holgerd77 / django-dynamic-scraper
Creating Scrapy scrapers via the Django admin interface
☆1,157 · Feb 19, 2022 · Updated 4 years ago
immzz / zhihu-scrapy
A Scrapy Zhihu crawler
☆77 · Nov 6, 2018 · Updated 7 years ago
cuanboy / ScrapyProject
Getting started with hands-on Scrapy, for example: saving to a database, downloading files, crawling JD.com and Taobao, Anti-Anti-Spider…
☆425 · Apr 22, 2018 · Updated 8 years ago
scrapinghub / scrapyrt
HTTP API for Scrapy spiders
☆882 · Jun 29, 2026 · Updated 2 weeks ago
qinxuye / cola
A high-level distributed crawling framework.
☆1,500 · Jul 31, 2022 · Updated 3 years ago
jackgitgz / CnblogsSpider
A crawler that uses Scrapy to scrape cnblogs list pages
☆274 · Jun 16, 2015 · Updated 11 years ago
xchaoinfo / fuck-login
Simulated login for some well-known websites, to make it easier to crawl sites that require login
☆5,868 · Jun 8, 2018 · Updated 8 years ago
AccordBox / awesome-scrapy
A curated list of awesome packages, articles, and other cool resources from the Scrapy community.
☆560 · Dec 28, 2022 · Updated 3 years ago
baabaaox / ScrapyDouban
Douban Movies / Douban Books Scrapy crawler
☆791 · Dec 4, 2023 · Updated 2 years ago
feiskyer / scrapy-examples
Some Scrapy and web.py examples
☆79 · May 20, 2017 · Updated 9 years ago
Gerapy / Gerapy
Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js
☆3,505 · Jul 4, 2026 · Updated 2 weeks ago
scrapy / scrapyd-client
Command line client for Scrapyd server
☆773 · Feb 27, 2026 · Updated 4 months ago
luyishisi / Anti-Anti-Spider
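More and more websites have anti-crawler features: some hide key data in images, others use user-hostile CAPTCHAs. This is a code repository for anti-anti-crawling, meant to improve skills by tackling websites with different defenses (no malicious intent). (Submissions of hard-to-scrape websites are welcome.) (Project paused for work reasons.)
☆7,286 · Oct 17, 2021 · Updated 4 years ago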
grangier / python-goose
HTML Content / Article Extractor, web scraping lib in Python
☆4,100 · Mar 10, 2026 · Updated 4 months ago
fxsjy / jieba
Jieba Chinese word segmentation
☆35,062 · Aug 21, 2024 · Updated last year
egrcc / zhihu-python
Fetches Zhihu content, including questions, answers, users, and collections
☆2,331 · Feb 8, 2022 · Updated 4 years ago