Scrapy project to scrape public web directories (educational) [DEPRECATED]
☆1,626 · Oct 27, 2017 · Updated 8 years ago
Alternatives and similar repositories for dirbot
Users that are interested in dirbot are comparing it to the libraries listed below.
- Multifarious Scrapy examples. Spiders for alexa / amazon / douban / douyu / github / linkedin etc. ☆3,261 · Nov 3, 2023 · Updated 2 years ago
- This is a sample Scrapy project for educational purposes ☆1,355 · Nov 29, 2023 · Updated 2 years ago
- Chinese translation of the Scrapy documentation ☆1,104 · Sep 12, 2019 · Updated 6 years ago
- A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite: a MongoDB cluster provides the underlying storage, Redis handles distribution, and Graphite displays crawler status ☆3,244 · Apr 18, 2017 · Updated 9 years ago
- Scrapy, a fast high-level web crawling & scraping framework for Python. ☆61,496 · Updated this week
- Redis-based components for Scrapy. ☆5,632 · Apr 8, 2026 · Updated 3 weeks ago
- A Powerful Spider (Web Crawler) System in Python. ☆16,844 · Apr 30, 2024 · Updated 2 years ago
- Scrapy examples for crawling zhihu and github ☆222 · Jan 11, 2023 · Updated 3 years ago
- ☆95 · Apr 28, 2014 · Updated 12 years ago
- Scrapy extension to control spiders using JSON-RPC ☆300 · Aug 26, 2019 · Updated 6 years ago
- WEIBO_SCRAPY is a Multi-Threading SINA WEIBO data extraction Framework in Python. ☆155 · Jul 28, 2017 · Updated 8 years ago
- This repository stores some examples for learning Scrapy better ☆176 · Oct 9, 2020 · Updated 5 years ago
- Visual scraping for Scrapy ☆9,491 · Jun 26, 2024 · Updated last year
- A dynamically configurable news crawler based on Scrapy ☆164 · Jul 24, 2017 · Updated 8 years ago
- A Scrapy crawler that collects cnblogs list pages ☆274 · Jun 16, 2015 · Updated 10 years ago
- A service daemon to run Scrapy spiders ☆3,088 · Apr 8, 2026 · Updated 3 weeks ago
- Fetches Zhihu content, including questions, answers, users, and collections ☆2,327 · Feb 8, 2022 · Updated 4 years ago
- Scrapy+Splash for JavaScript integration ☆3,232 · Feb 11, 2025 · Updated last year
- ☆23 · Jan 31, 2015 · Updated 11 years ago
- A high-level distributed crawling framework. ☆1,504 · Jul 31, 2022 · Updated 3 years ago
- Scrapy examples crawling Craigslist ☆199 · Apr 20, 2016 · Updated 10 years ago
- MongoDB pipeline for Scrapy. This module supports both MongoDB in standalone setups and replica sets. scrapy-mongodb will insert the item… ☆358 · Apr 6, 2021 · Updated 5 years ago
- HTML Content / Article Extractor, web scraping lib in Python ☆4,079 · Mar 10, 2026 · Updated last month
- Python SDK for the WeChat Official Accounts Platform [DEPRECATED] ☆1,349 · Oct 1, 2020 · Updated 5 years ago
- ☆167 · Nov 3, 2018 · Updated 7 years ago
- Jieba Chinese text segmentation ☆34,930 · Aug 21, 2024 · Updated last year
- A simple, yet elegant, HTTP library. ☆53,924 · Apr 24, 2026 · Updated last week
- A web spider for zhihu.com ☆724 · Jan 17, 2024 · Updated 2 years ago
- Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed. ☆22,222 · Mar 31, 2026 · Updated last month
- Fill HTML login forms automatically ☆279 · Apr 24, 2024 · Updated 2 years ago
- Sina Weibo crawler (Scrapy, Redis) ☆3,282 · Sep 5, 2018 · Updated 7 years ago
- The Python micro framework for building web applications. ☆71,479 · Apr 9, 2026 · Updated 3 weeks ago
- Collection of Scrapy utilities (extensions, middlewares, pipelines, etc) ☆33 · Feb 22, 2018 · Updated 8 years ago
- A JD.com (Jingdong) crawler written with Scrapy ☆450 · Dec 5, 2014 · Updated 11 years ago
- A middleware for Scrapy. Used to change the HTTP proxy from time to time. ☆323 · Feb 1, 2018 · Updated 8 years ago
- A pure-python HTML screen-scraping library ☆1,887 · Apr 4, 2022 · Updated 4 years ago
- Random proxy middleware for Scrapy ☆1,669 · Oct 1, 2019 · Updated 6 years ago
- Simulates logging in to several well-known websites, to make it easier to crawl sites that require login ☆5,879 · Jun 8, 2018 · Updated 7 years ago
- A complete and graceful API for Wechat. An interface for personal WeChat accounts, a WeChat bot, and command-line WeChat; a custom personal-account bot takes only thirty lines. ☆26,500 · Sep 28, 2023 · Updated 2 years ago