HyperionGray / starbelly
Streaming web crawler with WebSocket API
☆44 Updated last year
Alternatives and similar repositories for starbelly
Users who are interested in starbelly are comparing it to the libraries listed below.
- A component that tries to avoid downloading duplicate content ☆27 Updated 7 years ago
- https://mimesniff.spec.whatwg.org/ implementation for Python ☆13 Updated last year
- List of Sanctions and Most wanted ☆28 Updated 7 years ago
- Processes data from images which are tagged with the specified Instagram tag. ☆13 Updated 11 years ago
- A generic crawler ☆78 Updated 7 years ago
- Pluggable DSL that uses pipes to perform a series of linear transformations to extract data ☆16 Updated 10 months ago
- Scrapy middleware that allows crawling only new content ☆79 Updated 2 years ago
- API client for Aleph, supports bulk entity and document upload. ☆28 Updated 7 months ago
- Extract social media links and account names from websites. ☆38 Updated 4 years ago
- Gather information on Wiki contributions from IP ranges ☆24 Updated 7 years ago
- A collaborative platform for creating, editing and sharing JSON objects. ☆73 Updated 4 months ago
- Get user IDs from social network handles ☆12 Updated 8 years ago
- Automates the process of repeatedly searching for a website via scraped proxy IP and search keywords ☆45 Updated last year
- A content inspecting SMTP proxy ☆17 Updated 10 years ago
- Source code for the articles about OSINT, using social media APIs and Python. ☆22 Updated 6 years ago
- Exporters is an extensible export pipeline library that supports filtering, transformation, and several sources and destinations ☆40 Updated last year
- Playing around with food and drink sites and OSINT ☆15 Updated 6 years ago
- Site Hound (previously THH) is a Domain Discovery Tool ☆23 Updated 3 years ago
- ProxyCrawl Python library for scraping and crawling ☆59 Updated last year
- DomainsProject.org HTTP worker ☆23 Updated 2 years ago
- Collect email addresses by crawling search engine results. ☆29 Updated 2 years ago
- ☆15 Updated 6 years ago
- Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data. ☆63 Updated 3 weeks ago
- Mass HTTP brute forcer to detect directories and interesting technologies ☆10 Updated 8 years ago
- DomainClassifier is a Python (2/3) library to extract and classify Internet domains/hostnames/IP addresses from raw unstructured text fil… ☆77 Updated last year
- Code release for: Cookies that give you away: The surveillance implications of web tracking ☆53 Updated 6 years ago
- Napkin is a simple tool to produce statistical analysis of a text ☆12 Updated last year
- Chrome extension to extract data from websites visited in Chrome ☆18 Updated 10 years ago
- Bot for operating snscrape in #archivebot on efnet ☆10 Updated 5 years ago
- Extract the differences between two HTML pages ☆32 Updated 7 years ago