wuchong/scrapy-dynamic-configurable

wuchong / scrapy-dynamic-configurable

A dynamically configurable news crawler based on Scrapy

☆164

Alternatives and similar repositories for scrapy-dynamic-configurable

Users that are interested in scrapy-dynamic-configurable are comparing it to the libraries listed below.

jackgitgz / CnblogsSpider
A crawler that uses Scrapy to scrape cnblogs list pages
☆274 · Jun 16, 2015 · Updated 11 years ago
gnemoug / distribute_crawler
A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite: a MongoDB cluster provides the underlying storage, Redis handles distribution, and Graphite displays crawler status
☆3,242 · Apr 18, 2017 · Updated 9 years ago
zhijunio / scrapy-zhihu-github
Scrapy examples for crawling Zhihu and GitHub
☆221 · Jan 11, 2023 · Updated 3 years ago
miracledan / raindrop-spider
A simple distributed spider based on the Scrapy framework.
☆26 · Oct 22, 2015 · Updated 10 years ago
yoyzhou / weibo_scrapy
WEIBO_SCRAPY is a Multi-Threading SINA WEIBO data extraction Framework in Python.
☆155 · Jun 3, 2026 · Updated last month
yinzishao / NewsScrapy
A news crawler based on Scrapy
☆101 · Apr 18, 2020 · Updated 6 years ago
kohn / HttpProxyMiddleware
A middleware for Scrapy that changes the HTTP proxy from time to time.
☆323 · Feb 1, 2018 · Updated 8 years ago
geekan / scrapy-examples
Multifarious Scrapy examples. Spiders for alexa / amazon / douban / douyu / github / linkedin etc.
☆3,254 · Nov 3, 2023 · Updated 2 years ago
bluedazzle / multithreading-spider
A simple demo that uses threading and a queue to get proxies from proxy sites
☆17 · Mar 29, 2016 · Updated 10 years ago
stummjr / HackerNewsDailyDigest
A toy project with Scrapy + Django + Celery to run on Heroku
☆13 · Sep 8, 2015 · Updated 10 years ago
Andrew-liu / scrapy_example
This repository stores some examples for learning Scrapy better
☆175 · Oct 9, 2020 · Updated 5 years ago
hailong0707-zz / spider_news_all
Scrapy spider for various news websites
☆109 · Sep 3, 2015 · Updated 10 years ago
KeithYue / Zhihu_Spider
Scrape Zhihu content and user social network information
☆46 · Feb 15, 2014 · Updated 12 years ago
flisky / scrapy-phantomjs-downloader
PhantomJS Downloader for Scrapy, Yeah!
☆93 · Aug 11, 2014 · Updated 11 years ago
brandicted / scrapy-webdriver
☆143 · Nov 24, 2015 · Updated 10 years ago
pelick / VerticleSearchEngine
Academic Search Engine using Scrapy, MongoDB, Lucene/Solr, Tika, Struts2, jQuery, Bootstrap, D3, CAS
☆101 · Jun 16, 2013 · Updated 13 years ago
chenqx / spiderDemo
☆23 · Jan 31, 2015 · Updated 11 years ago
Glacier759 / Get_58_Data
Scrapes 58.com (58同城) data using the WebMagic framework
☆12 · Oct 13, 2014 · Updated 11 years ago
scrapy-plugins / scrapy-jsonrpc
Scrapy extension to control spiders using JSON-RPC
☆299 · Aug 26, 2019 · Updated 6 years ago
AmbientLighter / rpn-fas
Parser for open government data
☆29 · Jun 4, 2018 · Updated 8 years ago
python-cn / flask-slackbot
flask_slackbot helps you handle Slack outgoing webhooks.
☆22 · Jun 24, 2015 · Updated 11 years ago
holgerd77 / django-dynamic-scraper
Creating Scrapy scrapers via the Django admin interface
☆1,158 · Feb 19, 2022 · Updated 4 years ago
yeuchi / STLDecoder
3D format: STL file decoder for JavaScript
☆20 · Aug 31, 2021 · Updated 4 years ago
facert / scrapy_helper
Dynamically configurable crawler
☆87 · Jan 13, 2018 · Updated 8 years ago
rauls / nodejs-pack
Node.js module: pack/unpack commands identical to those in Perl and PHP. Code is based on php5 pack.c. This should be about 10 to 20…
☆20 · Sep 24, 2014 · Updated 11 years ago
armysheng / tech163newsSpider
Crawls NetEase News and stores it in a local MongoDB
☆42 · Jan 7, 2015 · Updated 11 years ago
dafyddcrosby / python-bcd
A binary-coded decimal conversion library for Python
☆11 · Feb 13, 2018 · Updated 8 years ago
immzz / zhihu-scrapy
A Scrapy crawler for Zhihu
☆77 · Nov 6, 2018 · Updated 7 years ago
feiskyer / scrapy-examples
Some Scrapy and web.py examples
☆79 · May 20, 2017 · Updated 9 years ago
grangier / python-goose
HTML Content / Article Extractor, web scraping lib in Python
☆4,101 · Mar 10, 2026 · Updated 4 months ago
scrapy / dirbot
Scrapy project to scrape public web directories (educational) [DEPRECATED]
☆1,628 · Oct 27, 2017 · Updated 8 years ago
jxltom / scrapymon
Simple Web UI for Scrapy spider management via Scrapyd
☆50 · Jun 25, 2018 · Updated 8 years ago
qinxuye / cola
A high-level distributed crawling framework.
☆1,500 · Jul 31, 2022 · Updated 3 years ago
maxliaops / scrapy-itzhaopin
☆94 · Apr 28, 2014 · Updated 12 years ago
jindongwang / informationretrieval
A Java implementation of a retriever for information retrieval
☆18 · Oct 10, 2017 · Updated 8 years ago
Glacier759 / newsEyeSpider
Scrapes newspaper content from various publishers; a simple customizable crawler driven by configuration files
☆11 · Sep 1, 2022 · Updated 3 years ago
aivarsk / scrapy-proxies
Random proxy middleware for Scrapy
☆1,669 · Oct 1, 2019 · Updated 6 years ago
andrebq / gas
GAS is a Go library to load assets from within GOPATH
☆29 · Jul 12, 2014 · Updated 12 years ago
TeamHG-Memex / arachnado
Web Crawling UI and HTTP API, based on Scrapy and Tornado
☆162 · Apr 8, 2026 · Updated 3 months ago