HyperionGray / starbelly
Streaming web crawler with WebSocket API
☆44 · Updated last year
Related projects
Alternatives and complementary repositories for starbelly
- A component that tries to avoid downloading duplicate content ☆27 · Updated 6 years ago
- https://mimesniff.spec.whatwg.org/ implementation for Python ☆14 · Updated 10 months ago
- A generic crawler ☆78 · Updated 6 years ago
- Broad crawler for domain discovery ☆19 · Updated 6 years ago
- Automates the process of repeatedly searching for a website via scraped proxy IP and search keywords ☆42 · Updated last year
- Scrapy middleware that allows crawling only new content ☆79 · Updated 2 years ago
- A collaborative platform for creating, editing and sharing JSON objects. ☆75 · Updated 3 weeks ago
- Graphistry admin docs: launch, configure, use, & debug ☆23 · Updated last week
- DomainClassifier is a Python (2/3) library to extract and classify Internet domains/hostnames/IP addresses from raw unstructured text fil… ☆78 · Updated 9 months ago
- Formasaurus tells you the type of an HTML form and its fields using machine learning ☆116 · Updated 5 months ago
- List of Sanctions and Most wanted ☆26 · Updated 7 years ago
- Scrapy middleware for the autologin ☆37 · Updated 6 years ago
- Python library for modern thread / multiprocessing pooling and task processing via asyncio ☆15 · Updated 3 years ago
- A simple DuckDuckGo URL scraper. ☆23 · Updated 9 months ago
- Gather information on Wiki contributions from IP ranges ☆24 · Updated 6 years ago
- Processes data from images which are tagged with the specified Instagram tag. ☆13 · Updated 10 years ago
- A project to attempt to automatically login to a website given a single seed ☆123 · Updated 2 years ago
- A helper library full of URL-related heuristics. ☆64 · Updated last month
- A rotating socks proxy using Tor, Delegate and Haproxy ☆14 · Updated 4 years ago
- A micro-framework for asynchronous deep crawls and web scraping with Python ☆13 · Updated last year
- Extract social media links and account names from websites. ☆37 · Updated 4 years ago
- Analyze scraped data ☆47 · Updated 4 years ago
- Async dnsbl spam lists checker based on asyncio/aiodns. ☆51 · Updated 2 months ago
- Homoglyphs: get similar letters, convert to ASCII, detect possible languages and UTF-8 group. ☆79 · Updated 3 years ago