TeamHG-Memex/scrapy-dockerhub

TeamHG-Memex / scrapy-dockerhub

[UNMAINTAINED] Deploy, run and monitor your Scrapy spiders.

☆12

Alternatives and similar repositories for scrapy-dockerhub

Users that are interested in scrapy-dockerhub are comparing it to the libraries listed below.

mitll / MITIE
MITIE: library and tools for information extraction
☆29 · Jan 22, 2015 · Updated 11 years ago
TeamHG-Memex / Formasaurus
Formasaurus tells you the type of an HTML form and its fields using machine learning
☆121 · Apr 8, 2026 · Updated 3 months ago
mitll / topic-clustering
☆44 · Jan 15, 2016 · Updated 10 years ago
NextCenturyCorporation / dig
Faceted search engine for domain-specific exploration of the Web
☆45 · Feb 10, 2017 · Updated 9 years ago
ericwhyne / open-catalog-generator
Code and templates required to build the DARPA open catalog.
☆18 · Mar 23, 2016 · Updated 10 years ago
TeamHG-Memex / docker-tor-rotator
A rotating socks proxy using Tor, Delegate and Haproxy
☆14 · Apr 8, 2026 · Updated 3 months ago
pymonger / facetview-memex
Facet Search interface for MEMEX.
☆13 · Feb 26, 2015 · Updated 11 years ago
TeamHG-Memex / scrapy-kafka-export
Scrapy extension which writes crawled items to Kafka
☆31 · Apr 8, 2026 · Updated 3 months ago
autonlab / tad
Temporal Anomaly Detector (TAD)
☆16 · Nov 2, 2017 · Updated 8 years ago
TeamHG-Memex / arachnado
Web Crawling UI and HTTP API, based on Scrapy and Tornado
☆162 · Apr 8, 2026 · Updated 3 months ago
scrapinghub / aduana
Frontera backend to guide a crawl using PageRank, HITS or other ranking algorithms based on the link structure of the web graph, even whe…
☆54 · May 21, 2024 · Updated 2 years ago
TeamHG-Memex / MaybeDont
A component that tries to avoid downloading duplicate content
☆28 · Apr 8, 2026 · Updated 3 months ago
Sotera / Datawake
Browser add-on and web server to support collection and analysis of web browsing data.
☆14 · Mar 9, 2016 · Updated 10 years ago
TeamHG-Memex / sitehound-frontend
Site Hound (previously THH) is a Domain Discovery Tool
☆24 · Apr 8, 2026 · Updated 3 months ago
chrismattmann / imagecat
ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest 10s of millions of files (image…
☆96 · Aug 26, 2018 · Updated 7 years ago
TeamHG-Memex / extract-html-diff
extract difference between two html pages
☆33 · Apr 8, 2026 · Updated 3 months ago
TeamHG-Memex / url-summary
Show summary of a large number of URLs in a Jupyter Notebook
☆19 · Apr 8, 2026 · Updated 3 months ago
TransparencyToolkit / dataspec-sii
Dataspec for SII
☆10 · Jan 4, 2017 · Updated 9 years ago
TeamHG-Memex / tor-proxy
a tor socks proxy docker image
☆12 · Apr 8, 2026 · Updated 3 months ago
jaegeral / awesome-cyber-civil-society-actors
A curated list of awesome cyber civil society actors, projects, etc.
☆10 · Jul 16, 2020 · Updated 6 years ago
nasa-jpl-memex / image_space
Interactive Image similarity and Visual Search and Retrieval application
☆95 · Apr 16, 2024 · Updated 2 years ago
scrapinghub / python-hubstorage
Deprecated HubStorage client library - please use python-scrapinghub>=1.9.0 instead
☆16 · Dec 5, 2016 · Updated 9 years ago
istresearch / scrapy-cluster
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
☆1,226 · Nov 7, 2023 · Updated 2 years ago
dossier / html-highlighter
Highlight and select phrases in HTML pages.
☆24 · Nov 4, 2019 · Updated 6 years ago
ContinuumIO / scrapy_scrapers
Scraper built with Scrapy.
☆18 · Jun 25, 2026 · Updated last month
hemslo / poky-engine
A simple search engine in python using Tornado, Scrapy, Redis and MongoDB
☆24 · Jun 21, 2013 · Updated 13 years ago
datamicroscopes / irm
Infinite relational model (IRM) for datamicroscopes
☆14 · Oct 26, 2015 · Updated 10 years ago
arrayfire / arrayfire-lua
Lua wrapper for ArrayFire
☆10 · Feb 27, 2017 · Updated 9 years ago
rmax / scrapydo
Crochet-based blocking API for Scrapy.
☆47 · Feb 24, 2017 · Updated 9 years ago
mitll / vizlinc
Vizlinc
☆15 · Jan 14, 2016 · Updated 10 years ago
NextCenturyCorporation / neon-gtd
Neon Geo-temporal Dashboard
☆14 · Jan 10, 2020 · Updated 6 years ago
arrayfire / clFFT
a software library containing FFT functions written in OpenCL
☆12 · Dec 12, 2024 · Updated last year
datamicroscopes / mixturemodel
Dirichlet process mixture model (DPMM) for datamicroscopes
☆14 · Oct 9, 2015 · Updated 10 years ago
smallk / smallk.github.io
SmallK: very fast data clustering tools
☆13 · Apr 3, 2019 · Updated 7 years ago
AKSW / openQA
☆11 · Jun 26, 2023 · Updated 3 years ago
sendgridlabs / cookiecutter-flaskrestful
Python Flask-RESTful template for cookiecutter
☆11 · Mar 31, 2016 · Updated 10 years ago
torps / torps
The Tor Path Simulator
☆86 · Jan 16, 2017 · Updated 9 years ago
bkj / wit
Algorithms for "schema matching"
☆26 · Jul 6, 2016 · Updated 10 years ago
fygrave / ndf
Network Defender Toolkit
☆18 · Jun 11, 2013 · Updated 13 years ago