ferventdesert/etlpy

A smart, stream-like crawler and ETL library for Python

☆416

Alternatives and similar repositories for etlpy

Users who are interested in etlpy are comparing it to the libraries listed below.

ferventdesert / Hawk-Projects
Project configurations for Hawk and etlpy; workflow definitions in XML format
☆149 · Jan 7, 2019 · Updated 7 years ago
ferventdesert / Hawk
Visual crawler and ETL IDE written in C#/WPF
☆3,220 · Dec 21, 2019 · Updated 6 years ago
FullerHua / gooseeker
☆694 · Oct 26, 2016 · Updated 9 years ago
gnemoug / distribute_crawler
A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite: a MongoDB cluster provides the underlying storage, Redis handles distribution, and Graphite displays crawler status
☆3,243 · Apr 18, 2017 · Updated 9 years ago
binux / pyspider
A powerful spider (web crawler) system in Python.
☆16,803 · Apr 30, 2024 · Updated 2 years ago
qinxuye / cola
A high-level distributed crawling framework.
☆1,501 · Jul 31, 2022 · Updated 3 years ago
qiyeboy / IPProxyPool
IPProxyPool, a proxy pool project that provides proxy IPs
☆4,276 · Jul 13, 2018 · Updated 7 years ago
gangtao / dataplay2
A simple data analysis application
☆284 · May 9, 2018 · Updated 8 years ago
rmax / scrapy-redis
Redis-based components for Scrapy.
☆5,641 · May 19, 2026 · Updated last month
liujiannong / StockSenatus
A WeChat mini program that collects information on stocks of interest and presents it in one place, for personal decision-making.
☆11 · Dec 4, 2016 · Updated 9 years ago
VAllens / Obfuscation.Fody
Fody extension to modify ObfuscationAttribute
☆10 · Feb 23, 2022 · Updated 4 years ago
LiuXingMing / captcha_identify
Provides a CAPTCHA recognition API
☆15 · May 30, 2018 · Updated 8 years ago
wuchong / scrapy-dynamic-configurable
A dynamically configurable news crawler based on Scrapy
☆164 · Jul 24, 2017 · Updated 8 years ago
aosabook / 500lines
500 Lines or Less
☆29,579 · Aug 19, 2023 · Updated 2 years ago
LiuXingMing / SinaSpider
Sina Weibo crawler (Scrapy, Redis)
☆3,283 · Sep 5, 2018 · Updated 7 years ago
xianhu / PSpider
A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560
☆1,841 · Jun 10, 2022 · Updated 4 years ago
scrapinghub / portia
Visual scraping for Scrapy
☆9,503 · Jun 26, 2024 · Updated 2 years ago
dinp / builder
DINP orchestration center; packages user code into a Docker image
☆10 · Feb 8, 2015 · Updated 11 years ago
waditu / tushare
TuShare is a utility for crawling historical data of China stocks
☆15,222 · Mar 13, 2024 · Updated 2 years ago
douban / dpark
Python clone of Spark, a MapReduce-like framework in Python
☆2,663 · Dec 25, 2020 · Updated 5 years ago
LiuRoy / spider_docker
Creates containers for crawler applications; included modules: scrapy, mongo, celery, rabbitmq
☆37 · Mar 22, 2016 · Updated 10 years ago
chyroc / WechatSogou
A crawler API for WeChat official accounts, based on Sogou WeChat search
☆6,326 · Mar 7, 2026 · Updated 4 months ago
my8100 / scrapydweb
Web app for Scrapyd cluster management, Scrapy log analysis & visualization, Auto packaging, Timer tasks, Monitor & Alert, and Mobile UI.…
☆3,409 · Feb 19, 2025 · Updated last year
LiuXingMing / cnn_on_captcha
CAPTCHA recognition with a CNN (Xuekubao)
☆16 · May 30, 2018 · Updated 8 years ago
istresearch / scrapy-cluster
This Scrapy project uses Redis and Kafka to create a distributed on-demand scraping cluster.
☆1,224 · Nov 7, 2023 · Updated 2 years ago
jackgitgz / CnblogsSpider
A crawler that uses Scrapy to scrape cnblogs list pages
☆274 · Jun 16, 2015 · Updated 11 years ago
gtbotsonar / analyse-plugin-lua
bot analyze openresty plugins
☆13 · May 8, 2019 · Updated 7 years ago
luyishisi / Anti-Anti-Spider
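More and more websites have anti-crawler features: some hide key data in images, and some use user-hostile CAPTCHAs. This is a code repository for anti-anti-crawler techniques, meant to improve skills by contending (without malice) with websites that have different defenses. (Submissions of hard-to-scrape websites are welcome.) (The project is paused for work reasons.)
☆7,287 · Oct 17, 2021 · Updated 4 years ago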
jiehua233 / ipproxy
Proxy IP extraction tool
☆115 · Sep 7, 2017 · Updated 8 years ago
zlzforever / ezcrawler
crawler, chrome extension
☆21 · Jan 5, 2026 · Updated 6 months ago
DormyMo / SpiderKeeper
Admin UI for Scrapy / open-source Scrapinghub
☆2,768 · May 4, 2023 · Updated 3 years ago
scrapy-plugins / scrapy-pagestorage
A Scrapy extension to store request and response information in a storage service
☆27 · Mar 11, 2022 · Updated 4 years ago
yijingping / unicrawler
A general-purpose, configurable crawler framework
☆543 · Feb 9, 2023 · Updated 3 years ago
littlecodersh / ItChat
A complete and graceful API for WeChat. A WeChat personal account API, WeChat bot, and command-line WeChat; a custom personal-account bot takes only thirty lines.
☆26,472 · Sep 28, 2023 · Updated 2 years ago
ii0 / wechat-spider-1
A crawler for the WeChat mobile client that scrapes all of an official account's articles, along with read counts, like counts, and comments
☆11 · Nov 11, 2018 · Updated 7 years ago
chenqx / spiderDemo
☆23 · Jan 31, 2015 · Updated 11 years ago
miyakogi / pyppeteer
Headless chrome/chromium automation library (unofficial port of puppeteer)
☆3,552 · Aug 5, 2021 · Updated 4 years ago
Emptyset110 / dHydra
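A quantitative analysis development framework aimed mainly at real-time computation across multiple data sources and strategies. Provides data retrieval from sources such as Sina Level2
☆492 · Jan 14, 2017 · Updated 9 years ago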
luopeixiong / tyc_ttl
☆13 · Jul 12, 2018 · Updated 7 years ago