killets / Distributed-Web-Crawler-with-Celery
Python: selenium, beautifulsoup2, celery, rabbitmq, Amazon AWS (EC2, S3)
☆10 · Updated 8 years ago
Related projects:
- Simple Web UI for Scrapy spider management via Scrapyd ☆49 · Updated 6 years ago
- Scraping tweets quickly using celery, RabbitMQ and Docker cluster ☆48 · Updated last year
- Exporters is an extensible export pipeline library that supports filter, transform and several sources and destinations ☆40 · Updated 4 months ago
- ☆29 · Updated 3 years ago
- Python scripts for scraping bus ticket data from the websites of BoltBus, Greyhound, Megabus, GoBus, Amtrak, Peterpan, and EasternTravel. ☆39 · Updated 3 years ago
- MongoDB extensions for Scrapy ☆44 · Updated 9 years ago
- Fast Python Bloom Filter using Mmap ☆13 · Updated 12 years ago
- docker scrapyd scrapy boot2docker crawler - a spider Python application that can be "Dockerized". ☆42 · Updated 9 years ago
- Python script for rotation through Proxy Servers ☆30 · Updated 6 years ago
- Scrape Google search results with Scrapy. ☆97 · Updated 4 years ago
- A Scrapy pipeline which sends items to an Elasticsearch server ☆98 · Updated 6 years ago
- Restrict crawl and scraping scope using matchers. ☆25 · Updated 8 years ago
- Python, Tor, Stem, Privoxy: with these tools, requests can open new connections via Tor to obtain new IP addresses. ☆24 · Updated 5 years ago
- A simple Python tool that generates a requests/bs4-based web scraper ☆26 · Updated 2 years ago
- Creates a pipeline with Airflow and Scrapy to output an average image composition of everyone's face on a given website ☆42 · Updated 6 years ago
- Scrapy extension which writes crawled items to Kafka ☆30 · Updated 5 years ago
- Library designed to replace the SQLite backend by a MongoDB backend on Scrapy queue management ☆17 · Updated 7 years ago
- Easy extraction of keywords and engines from search engine results pages (SERPs). ☆90 · Updated 2 years ago
- Paginating the web ☆37 · Updated 10 years ago
- A Scrapy crawler for http://books.toscrape.com ☆26 · Updated 7 years ago
- Resize image on the fly using flask, zappa, pillow, opencv-python ☆18 · Updated 7 years ago
- ☆35 · Updated this week
- Zyte Automatic Extraction integration for Scrapy ☆55 · Updated 2 years ago
- A simple AliExpress spider to crawl all products with Scrapy. ☆17 · Updated 6 years ago
- A project to attempt to automatically log in to a website given a single seed ☆11 · Updated 3 months ago
- Extract Social Profiles using Email Addresses (Python) ☆14 · Updated 6 years ago
- A web scraper in Python using Django and Celery ☆17 · Updated 11 years ago
- Scrapy spider middleware to clean up query parameters in request URLs ☆25 · Updated 8 years ago
- Boilerplate code to start with Celery and RabbitMQ in a Docker cluster ☆19 · Updated last year
- Django Boilerplate Template for SaaS applications ☆45 · Updated 5 months ago