Scrapy project to scrape public web directories (educational) [DEPRECATED]
☆1,627Oct 27, 2017Updated 8 years ago
Alternatives and similar repositories for dirbot
Users that are interested in dirbot are comparing it to the libraries listed below.
- Multifarious Scrapy examples. Spiders for alexa / amazon / douban / douyu / github / linkedin etc.☆3,257Nov 3, 2023Updated 2 years ago
- This is a sample Scrapy project for educational purposes☆1,357Nov 29, 2023Updated 2 years ago
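- Chinese translation of the Scrapy documentation☆1,104Sep 12, 2019Updated 6 years ago
- A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite: a MongoDB cluster for underlying storage, Redis for distribution, and Graphite for displaying crawler status☆3,243Apr 18, 2017Updated 9 years ago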
- Scrapy, a fast high-level web crawling & scraping framework for Python.☆62,120Updated this week
- Redis-based components for Scrapy.☆5,637May 19, 2026Updated 3 weeks ago
- A Powerful Spider(Web Crawler) System in Python.☆16,811Apr 30, 2024Updated 2 years ago
- scrapy examples for crawling zhihu and github☆222Jan 11, 2023Updated 3 years ago
- ☆94Apr 28, 2014Updated 12 years ago
- Scrapy extension to control spiders using JSON-RPC☆299Aug 26, 2019Updated 6 years ago
- WEIBO_SCRAPY is a Multi-Threading SINA WEIBO data extraction Framework in Python.☆155Jun 3, 2026Updated last week
- This repository stores some examples for learning Scrapy better☆176Oct 9, 2020Updated 5 years ago
- Visual scraping for Scrapy☆9,505Jun 26, 2024Updated last year
- A dynamic configurable news crawler based Scrapy☆164Jul 24, 2017Updated 8 years ago
- A Scrapy spider that scrapes cnblogs list pages☆274Jun 16, 2015Updated 10 years ago
- A service daemon to run Scrapy spiders☆3,092Updated this week
- Fetches Zhihu content, including questions, answers, users, and collections☆2,329Feb 8, 2022Updated 4 years ago
- Scrapy+Splash for JavaScript integration☆3,231Feb 11, 2025Updated last year
- ☆23Jan 31, 2015Updated 11 years ago
- A high-level distributed crawling framework.☆1,501Jul 31, 2022Updated 3 years ago
- Scrapy examples crawling Craigslist☆199Apr 20, 2016Updated 10 years ago
- MongoDB pipeline for Scrapy. This module supports both MongoDB in standalone setups and replica sets. scrapy-mongodb will insert the item…☆358Apr 6, 2021Updated 5 years ago
- HTML Content / Article Extractor, web scraping lib in Python☆4,088Mar 10, 2026Updated 3 months ago
- Python SDK for the WeChat Official Accounts Platform [DEPRECATED]☆1,348Oct 1, 2020Updated 5 years ago
- ☆167Nov 3, 2018Updated 7 years ago
- Jieba Chinese word segmentation☆35,002Aug 21, 2024Updated last year
- A simple, yet elegant, HTTP library.☆54,038Updated this week
- A web spider for zhihu.com☆722Jan 17, 2024Updated 2 years ago
- Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.☆22,181Updated this week
- Fill HTML login forms automatically☆279Apr 24, 2024Updated 2 years ago
- Sina Weibo crawler (Scrapy, Redis)☆3,281Sep 5, 2018Updated 7 years ago
- Collection of Scrapy utilities (extensions, middlewares, pipelines, etc)☆33Feb 22, 2018Updated 8 years ago
- The Python micro framework for building web applications.☆71,647May 31, 2026Updated last week
- A middleware for scrapy. Used to change HTTP proxy from time to time.☆323Feb 1, 2018Updated 8 years ago
- A JD.com (Jingdong) crawler written with Scrapy☆451Dec 5, 2014Updated 11 years ago
- A pure-python HTML screen-scraping library☆1,888Apr 4, 2022Updated 4 years ago
- Random proxy middleware for Scrapy☆1,668Oct 1, 2019Updated 6 years ago
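- Simulated login for some well-known websites, to make it easier to crawl sites that require login☆5,873Jun 8, 2018Updated 8 years ago
- A complete and graceful API for Wechat. A WeChat personal account interface, WeChat bot, and command-line WeChat; a custom personal-account bot takes only thirty lines.☆26,462Sep 28, 2023Updated 2 years ago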