atupal / ccrawler
A distributed crawler that uses Celery.
☆17 Updated 10 years ago
Alternatives and similar repositories for ccrawler:
Users that are interested in ccrawler are comparing it to the libraries listed below
- Exporters is an extensible export pipeline library that supports filter, transform and several sources and destinations ☆40 Updated 11 months ago
- Search engine base (crawler, indexer and parser) using Python, Celery, RabbitMQ, CouchDB and Whoosh. ☆11 Updated last year
- Toy web crawler ☆21 Updated 13 years ago
- Tool to flatten stream of JSON-like objects, configured via schema ☆33 Updated 5 years ago
- Fast Python Bloom Filter using Mmap ☆13 Updated 12 years ago
- Tornado Web Crawler ☆66 Updated 12 years ago
- A component that tries to avoid downloading duplicate content ☆27 Updated 6 years ago
- Find which links on a web page are pagination links ☆29 Updated 8 years ago
- Recommendations Serving Engine using python ☆28 Updated 9 years ago
- Scrapy middleware for the autologin ☆37 Updated 6 years ago
- Crawlera tools ☆26 Updated 9 years ago
- Turn your IPython console into a cross-database SQL client ☆31 Updated 8 years ago
- python library for interacting with SolrCloud ☆36 Updated 4 years ago
- Scan for missing timeout calls in python source files ☆18 Updated 5 years ago
- A scrapy extension to store requests and responses information in storage service ☆26 Updated 3 years ago
- A middleware to use random user agent in Scrapy crawler. ☆33 Updated 12 years ago
- Python SMTP client and Email for Humans™ ☆82 Updated 6 years ago
- Simple Web UI for Scrapy spider management via Scrapyd ☆51 Updated 6 years ago
- [not actively maintained] The C++ webkit-server from capybara-webkit with useful extensions and Python bindings ☆48 Updated 4 years ago
- Integrates terminado (a web based terminal) with flask ☆15 Updated 7 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆46 Updated 7 years ago
- Cubes OLAP Examples ☆74 Updated 6 years ago
- Application Driven Stats Monitoring ☆230 Updated 9 years ago
- Supervisor On/Off: an alternative web interface for supervisor ☆51 Updated 8 years ago
- Coursera materials downloader. ☆58 Updated 2 years ago
- A simple proxy web service in 19 lines of Python code. ☆23 Updated 10 years ago
- Gevent Crawling in Python, with Utilities ☆23 Updated 10 years ago
- collection of modules to build distributed and reliable concurrent systems in Python. ☆205 Updated 11 years ago
- A crawler, indexer, and query interface all in Python with distributed processing via Pyro4. ☆23 Updated 13 years ago
- Adding Social Authentication to Django ☆19 Updated 10 years ago