simon987 / Architeuthis
MITM HTTP(S) proxy with integrated load-balancing, rate-limiting and error handling. Built for automated web scraping.
☆41 Updated 5 years ago
Alternatives and similar repositories for Architeuthis:
Users who are interested in Architeuthis are comparing it to the libraries listed below:
- 💾 Easily manage access to your open directory through OAuth2 ☆78 Updated 2 years ago
- Distributed crawler, database and web frontend for public directories indexing ☆139 Updated 5 years ago
- Browsertrix: Containerized High-Fidelity Browser-Based Automated Crawling + Behavior System ☆88 Updated 4 years ago
- Convert HTTP Archive (HAR) -> Web Archive (WARC) format ☆51 Updated 6 years ago
- 💾 YouTube video metadata archiver written in Golang ☆19 Updated 5 years ago
- A server to collect & archive websites that also supports video downloads ☆86 Updated 2 years ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆39 Updated 2 weeks ago
- Easy to use rclone mount/unmount scripts ☆10 Updated 5 years ago
- Streaming web crawler with WebSocket API ☆44 Updated last year
- Tool for real-time scraping of news articles. ☆39 Updated 5 years ago
- web-based epub indexer ☆88 Updated 10 months ago
- The Temboz RSS/Atom feed reader ☆83 Updated last year
- 🧠 AI powered image tagger backed by DeepDetect ☆245 Updated 6 years ago
- oldweb.today Remote/Containerized Browser System ☆10 Updated 6 years ago
- Scrapy middleware that allows crawling only new content ☆80 Updated 2 years ago
- A DHT crawler and torrent indexer ☆110 Updated 6 years ago
- A framework for quick web archiving; canonical repository: https://gitea.arpa.li/JustAnotherArchivist/qwarc ☆27 Updated 3 years ago
- Configurable Scraper & Downloader, Powered by RegExp and Go ☆64 Updated last year
- Media Processing Pipeline ☆69 Updated 5 years ago
- Zood Location API server ☆30 Updated 5 months ago
- A UDP torrent tracker scraper library written in Python 3 ☆54 Updated 2 years ago
- Subtitle Download Service ☆16 Updated 2 years ago
- Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head ☆171 Updated 4 years ago
- Deterministic Usenet Vault ☆31 Updated 5 years ago
- Configure, launch, and work in Dockerized environments ☆32 Updated 4 years ago
- OD-Database Go crawler ☆26 Updated 5 years ago
- High-Performance BitTorrent Tracker written in C++ ☆73 Updated 4 years ago
- [mirror] Standalone DHT search ☆58 Updated 2 years ago
- A tiny HTTP server that can serve files out of any rclone remote. ☆39 Updated 8 years ago
- URLTeam's second generation of URL shortener archiving tools ☆75 Updated last month