Insutanto / scrapy-distributed
A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.
☆54Updated last year
Related projects:
- A tool for parsing Scrapy log files periodically and incrementally, extending the HTTP JSON API of Scrapyd.☆88Updated 2 years ago
- An intelligent web service to automatically detect web content and extract information from it.☆84Updated last year
- Distributed crawling/scraping, Kafka And Redis based components for Scrapy☆45Updated 3 years ago
- Scrapy + Puppeteer☆110Updated 3 years ago
- A large httpx-based project that crawls the vinyl record site Discogs☆101Updated last year
- A chrome extension to get XPath of list items in webpage easily.☆35Updated 2 years ago
- Implement scrapy with asyncio☆56Updated 3 weeks ago
- Distributed Scrapy crawler based on RabbitMQ☆34Updated 3 years ago
- Downloader Middleware to support Playwright in Scrapy & Gerapy☆106Updated 2 years ago
- Tinepeas, our own crawler framework.☆63Updated last month
- Crawler management system with cluster support and elastic scaling. Runs feapder, scrapy, selenium, playwright, and other frameworks and scripts☆99Updated 5 months ago
- Scrapy Redis Bloom Filter☆173Updated 3 years ago
- Distributed task redisqueue (the simplest Python distributed function scheduling framework)☆63Updated 11 months ago
- Zhihu login☆22Updated 5 years ago
- Auto Extractor Module☆319Updated last month
- Downloader Middleware to support Pyppeteer in Scrapy & Gerapy☆137Updated 2 years ago
- Use pyppeteer from a Scrapy spider☆60Updated 4 years ago
- ☆50Updated 10 months ago
- scrapy-redis-sentinel: builds on scrapy-redis, adding a sentinel connection mode and a cluster connection mode.☆30Updated last year
- Python client for Redisbloom☆76Updated last year
- SDK for Crawlab, including SDK for different programming languages such as Python, Node.js and Java, and a CLI Tool written in Python.☆55Updated 3 months ago
- Pyppeteer integration for Scrapy☆60Updated 3 years ago
- Pipeline extensions for feapder☆16Updated last year
- Scrapy stats exporter for prometheus☆17Updated last year
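- Cluster version of scrapy-redis; uses a Redis cluster to deduplicate independently across a very large number of sites, avoiding the problem of insufficient memory on a single machine☆138Updated last year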
- Backend core modules for Crawlab☆48Updated 3 months ago
- Scrapy Redis with Bloom Filter, supporting Redis sentinel and cluster☆23Updated last year
- pip install universal_object_pool: a universal object pool that can pool objects of any custom type.☆17Updated last year
- National Medical Products Administration, a certain "Shu" version (FSSBBIl1UgzbN7N82T)☆56Updated 2 years ago
- Crawler monitoring and visualization (Prometheus and Grafana). Building a crawler with distributed task queues (Celery) and fetching data with a reliable monitor sy…☆44Updated last year