Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues.
☆438 · Dec 30, 2022 · Updated 3 years ago
Alternatives and similar repositories for Crawling-Infrastructure
Users who are interested in Crawling-Infrastructure compare it to the libraries listed below.
- JavaScript scraping module based on puppeteer for many different search engines... ☆568 · Dec 30, 2022 · Updated 3 years ago
- A Python module to scrape several search engines (like Google, Yandex, Bing, DuckDuckGo, ...). Including asynchronous networking support. ☆2,796 · Jul 3, 2021 · Updated 4 years ago
- 🧱 A uniform template to use as a foundation for Puppeteer bot construction. ☆69 · May 6, 2021 · Updated 4 years ago
- ☆12 · May 7, 2023 · Updated 2 years ago
- Scrapoxy has been discontinued. ☆2,432 · Feb 7, 2026 · Updated 3 weeks ago
- 📡 Renew the IP address of a tethered Android device via Node asynchronously. ☆75 · Aug 3, 2023 · Updated 2 years ago
- 💯 Teach puppeteer new tricks through plugins. ☆7,263 · Jul 18, 2024 · Updated last year
- The web scraper that's nearly impossible to block - now called @ulixee/hero ☆729 · Mar 7, 2023 · Updated 3 years ago
- Event Data Collector ☆39 · Jan 12, 2026 · Updated last month
- ☆20 · Apr 21, 2020 · Updated 5 years ago
- Assorted, MIT licensed, threat hunting rules from @bradleyjkemp ☆14 · Mar 11, 2022 · Updated 3 years ago
- The BlogDB Webservice ☆12 · Feb 1, 2022 · Updated 4 years ago
- Docker kinsing malware bitcoin/xmr miner ☆23 · Feb 18, 2021 · Updated 5 years ago
- Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot system 👻 and get around browser fingerprint… ☆4,962 · Jul 17, 2024 · Updated last year
- Solution to stop sites from fingerprinting your puppeteer ☆130 · Apr 21, 2024 · Updated last year
- List of free and checked http, https, socks4 and socks5 proxies ☆17 · Updated this week
- Swift code to parse the quarantine history database, Chrome history database, Safari history database, and Firefox history database on ma… ☆15 · Dec 3, 2020 · Updated 5 years ago
- Microsoft Applocker evasion tool ☆39 · Nov 26, 2019 · Updated 6 years ago
- Passive TCP/IP Fingerprinting Tool. Run this on your server and find out what Operating Systems your clients are *really* using. ☆409 · Nov 15, 2025 · Updated 3 months ago
- Automated WireGuard Deployment on Azure ☆46 · Feb 28, 2021 · Updated 5 years ago
- Distributed crawler powered by Headless Chrome ☆5,710 · Apr 29, 2023 · Updated 2 years ago
- Site Hound (previously THH) is a Domain Discovery Tool ☆23 · Feb 10, 2026 · Updated 3 weeks ago
- ☆177 · Dec 30, 2022 · Updated 3 years ago
- A streamlined tool for decoding and simplifying JavaScript obfuscated by Datadome's Interstitial challenge, enhancing readability and mai… ☆32 · Jan 12, 2024 · Updated 2 years ago
- A Node.js module for looking up running processes ☆16 · Sep 26, 2025 · Updated 5 months ago
- Puppeteer Pool, run a cluster of instances in parallel ☆3,513 · Updated this week
- Search Google, Bing, Yahoo, and other search engines with Python ☆670 · Apr 2, 2025 · Updated 11 months ago
- Web application/technology detection tool ☆211 · Sep 1, 2023 · Updated 2 years ago
- A test suite of common scraper detection techniques. See how detectable your scraper stack is. ☆141 · Oct 31, 2022 · Updated 3 years ago
- Chromium Binary for AWS Lambda and Google Cloud Functions ☆3,292 · Sep 3, 2024 · Updated last year
- Command line programs to save Google documents to text and LaTeX files ☆19 · Oct 2, 2020 · Updated 5 years ago
- PoC for detecting and evading ETW detection of .NET Assembly.Load ☆21 · Aug 26, 2020 · Updated 5 years ago
- Just like on ScraperWiki Classic; now a part of QuickCode. ☆38 · Aug 12, 2016 · Updated 9 years ago
- ☆116 · Mar 16, 2024 · Updated last year
- BloodHound Cypher Queries Ported to a Jupyter Notebook ☆53 · Jun 20, 2020 · Updated 5 years ago
- ☆53 · Oct 20, 2020 · Updated 5 years ago
- donLoader is a shellcode loader creation tool that uses donut to convert executable payloads into shellcode to evade detection on disk. ☆20 · Nov 24, 2021 · Updated 4 years ago
- Dead simple C# project to take a screenshot. ☆19 · Jan 14, 2019 · Updated 7 years ago
- Node.js library to parse Google SERP HTML pages ☆46 · Jul 27, 2023 · Updated 2 years ago