scrapinghub/frontera

A scalable frontier for web crawlers

☆1,332

Alternatives and similar repositories for frontera

Users interested in frontera are comparing it to the libraries listed below.

istresearch / scrapy-cluster
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
☆1,225 · Nov 7, 2023 · Updated 2 years ago
scrapinghub / scrapyrt
HTTP API for Scrapy spiders
☆882 · Jun 29, 2026 · Updated 3 weeks ago
rmax / scrapy-redis
Redis-based components for Scrapy.
☆5,645 · May 19, 2026 · Updated 2 months ago
scrapinghub / scrapy-frontera
More flexible and featured Frontera scheduler for Scrapy
☆36 · Jun 6, 2025 · Updated last year
scrapinghub / splash
Lightweight, scriptable browser as a service with an HTTP API
☆4,190 · Aug 2, 2024 · Updated last year
scrapinghub / portia
Visual scraping for Scrapy
☆9,506 · Jun 26, 2024 · Updated 2 years ago
xsren / frontera-docs-zh_CN
Chinese translation of the frontera documentation
☆35 · Mar 10, 2018 · Updated 8 years ago
scrapy-plugins / scrapy-splash
Scrapy+Splash for JavaScript integration
☆3,229 · Feb 11, 2025 · Updated last year
scrapy / scrapyd
A service daemon to run Scrapy spiders
☆3,097 · Updated this week
DormyMo / SpiderKeeper
Admin UI for Scrapy / open source Scrapinghub
☆2,768 · May 4, 2023 · Updated 3 years ago
scrapinghub / webstruct
NER toolkit for HTML data
☆259 · May 3, 2024 · Updated 2 years ago
scrapy / scrapely
A pure-python HTML screen-scraping library
☆1,884 · Apr 4, 2022 · Updated 4 years ago
scrapinghub / spidermon
Scrapy Extension for monitoring spiders execution.
☆561 · May 28, 2026 · Updated last month
Gerapy / Gerapy
Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js
☆3,503 · Jul 4, 2026 · Updated 2 weeks ago
scrapinghub / extruct
Extract embedded metadata from HTML markup
☆966 · Apr 1, 2026 · Updated 3 months ago
scrapy / scrapyd-client
Command line client for Scrapyd server
☆772 · Feb 27, 2026 · Updated 4 months ago
scrapinghub / aduana
Frontera backend to guide a crawl using PageRank, HITS or other ranking algorithms based on the link structure of the web graph, even whe…
☆54 · May 21, 2024 · Updated 2 years ago
TeamHG-Memex / arachnado
Web Crawling UI and HTTP API, based on Scrapy and Tornado
☆162 · Apr 8, 2026 · Updated 3 months ago
scrapinghub / dateparser
Python parser for human-readable dates
☆2,843 · Updated this week
my8100 / scrapydweb
Web app for Scrapyd cluster management, Scrapy log analysis & visualization, Auto packaging, Timer tasks, Monitor & Alert, and Mobile UI.…
☆3,409 · Feb 19, 2025 · Updated last year
apache / stormcrawler
A scalable, mature and versatile web crawler based on Apache Storm
☆986 · Updated this week
aivarsk / scrapy-proxies
Random proxy middleware for Scrapy
☆1,669 · Oct 1, 2019 · Updated 6 years ago
scrapinghub / page_finder
Find which links on a web page are pagination links
☆29 · Jan 12, 2017 · Updated 9 years ago
scrapinghub / python-scrapinghub
A client interface for Scrapinghub's API
☆206 · Jul 14, 2026 · Updated last week
scrapoxy / scrapoxy
Scrapoxy has been discontinued.
☆2,413 · Feb 7, 2026 · Updated 5 months ago
TeamHG-Memex / autologin
A project to attempt to automatically log in to a website given a single seed
☆129 · Apr 8, 2026 · Updated 3 months ago
holgerd77 / django-dynamic-scraper
Creating Scrapy scrapers via the Django admin interface
☆1,158 · Feb 19, 2022 · Updated 4 years ago
scrapy-plugins / scrapy-jsonrpc
Scrapy extension to control spiders using JSON-RPC
☆299 · Aug 26, 2019 · Updated 6 years ago
dfdeshom / scrapy-kafka
Kafka-based components for Scrapy
☆78 · Apr 10, 2018 · Updated 8 years ago
TeamHG-Memex / scrapy-dockerhub
[UNMAINTAINED] Deploy, run and monitor your Scrapy spiders.
☆12 · Apr 8, 2026 · Updated 3 months ago
scrapinghub / shub
Scrapinghub Command Line Client
☆129 · Updated this week
binux / pyspider
A Powerful Spider (Web Crawler) System in Python.
☆16,797 · Apr 30, 2024 · Updated 2 years ago
scrapinghub / scmongo
MongoDB extensions for Scrapy
☆44 · Oct 2, 2014 · Updated 11 years ago
TeamHG-Memex / scrapy-kafka-export
Scrapy extension which writes crawled items to Kafka
☆31 · Apr 8, 2026 · Updated 3 months ago
BruceDone / awesome-crawler
A collection of awesome web crawlers and spiders in different languages
☆7,257 · Jun 16, 2024 · Updated 2 years ago
scrapy / scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
☆63,273 · Updated this week
TeamHG-Memex / Formasaurus
Formasaurus tells you the type of an HTML form and its fields using machine learning
☆121 · Apr 8, 2026 · Updated 3 months ago
gnemoug / distribute_crawler
A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite: storage is a MongoDB cluster, distribution is handled by Redis, and crawler status is displayed with Graphite.
☆3,243 · Apr 18, 2017 · Updated 9 years ago
scrapy / scrapy-bench
A CLI for benchmarking Scrapy.
☆32 · Jun 28, 2025 · Updated last year