HyperionGray / starbelly
Streaming web crawler with WebSocket API
☆44 Updated last year
Alternatives and similar repositories for starbelly:
Users who are interested in starbelly are comparing it to the libraries listed below:
- A component that tries to avoid downloading duplicate content ☆27 Updated 6 years ago
- Scrapy middleware that allows crawling only new content ☆80 Updated 2 years ago
- Automates the process of repeatedly searching for a website via scraped proxy IP and search keywords ☆44 Updated last year
- https://mimesniff.spec.whatwg.org/ implementation for Python ☆14 Updated last year
- ☆15 Updated 6 years ago
- A collaborative platform for creating, editing and sharing JSON objects. ☆73 Updated 2 months ago
- A generic crawler ☆78 Updated 6 years ago
- Exporters is an extensible export pipeline library that supports filter, transform and several sources and destinations ☆40 Updated 9 months ago
- Scrapy middleware for the autologin ☆37 Updated 6 years ago
- Formasaurus tells you the type of an HTML form and its fields using machine learning ☆118 Updated 8 months ago
- List of Sanctions and Most wanted ☆26 Updated 7 years ago
- A rotating socks proxy using Tor, Delegate and Haproxy ☆14 Updated 5 years ago
- Pluggable DSL that uses pipes to perform a series of linear transformations to extract data ☆15 Updated 8 months ago
- ☆29 Updated 3 years ago
- This is the facade for installation and access to the individual components ☆15 Updated 6 years ago
- Scrapy Python crawler/spider with POST/GET login (handles CSRF), variable recursion levels, and optional saving to disk ☆55 Updated 6 years ago
- Extract social media links and account names from websites. ☆38 Updated 4 years ago
- Zyte Automatic Extraction integration for Scrapy ☆56 Updated 3 years ago
- A Scrapy extension to store request and response information in a storage service ☆26 Updated 3 years ago
- scrapin' proxies with ocr ☆19 Updated 6 years ago
- Broad crawler for domain discovery ☆19 Updated 6 years ago
- A project to attempt to automatically login to a website given a single seed ☆123 Updated 2 years ago
- Python API for parsehub.com web scraping service ☆45 Updated 6 years ago
- ☆14 Updated 6 years ago
- DomainClassifier is a Python (2/3) library to extract and classify Internet domains/hostnames/IP addresses from raw unstructured text fil… ☆77 Updated last year
- ProxyCrawl Python library for scraping and crawling ☆59 Updated last year
- detectem - detect software and its version on websites. ☆155 Updated 3 years ago
- Collect email addresses by crawling search engine results. ☆29 Updated 2 years ago
- ☕🗄CAching Proxy in Python – Simple file based python http proxy ☆15 Updated 3 years ago
- Notebook collection ☆10 Updated 5 years ago