scrapinghub/portia

Visual scraping for Scrapy

☆9,505

Alternatives and similar repositories for portia

Users that are interested in portia are comparing it to the libraries listed below.

scrapy / scrapely
A pure-python HTML screen-scraping library
☆1,884 · Apr 4, 2022 · Updated 4 years ago
binux / pyspider
A Powerful Spider(Web Crawler) System in Python.
☆16,797 · Apr 30, 2024 · Updated 2 years ago
scrapinghub / splash
Lightweight, scriptable browser as a service with an HTTP API
☆4,190 · Aug 2, 2024 · Updated last year
scrapy / scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
☆63,387 · Updated this week
scrapy / scrapyd
A service daemon to run Scrapy spiders
☆3,097 · Updated this week
scrapinghub / frontera
A scalable frontier for web crawlers
☆1,332 · Jun 6, 2025 · Updated last year
DormyMo / SpiderKeeper
admin ui for scrapy/open source scrapinghub
☆2,768 · May 4, 2023 · Updated 3 years ago
rmax / scrapy-redis
Redis-based components for Scrapy.
☆5,644 · May 19, 2026 · Updated 2 months ago
scrapy-plugins / scrapy-splash
Scrapy+Splash for JavaScript integration
☆3,229 · Feb 11, 2025 · Updated last year
geekan / scrapy-examples
Multifarious Scrapy examples. Spiders for alexa / amazon / douban / douyu / github / linkedin etc.
☆3,252 · Nov 3, 2023 · Updated 2 years ago
Gerapy / Gerapy
Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js
☆3,503 · Jul 4, 2026 · Updated 2 weeks ago
lorien / grab
Web Scraping Framework
☆2,461 · Sep 19, 2025 · Updated 10 months ago
istresearch / scrapy-cluster
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
☆1,226 · Nov 7, 2023 · Updated 2 years ago
huginn / huginn
Create agents that monitor and act on your behalf. Your agents are standing by!
☆49,678 · Updated this week
my8100 / scrapydweb
Web app for Scrapyd cluster management, Scrapy log analysis & visualization, Auto packaging, Timer tasks, Monitor & Alert, and Mobile UI.…
☆3,411 · Feb 19, 2025 · Updated last year
clips / pattern
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
☆8,857 · Jun 10, 2024 · Updated 2 years ago
scrapinghub / scrapyrt
HTTP API for Scrapy spiders
☆882 · Jun 29, 2026 · Updated 3 weeks ago
codelucas / newspaper
newspaper3k is a news, full-text, and article metadata extraction in Python 3. Advanced docs:
☆15,121 · Updated this week
qinxuye / cola
A high-level distributed crawling framework.
☆1,500 · Jul 31, 2022 · Updated 3 years ago
gnemoug / distribute_crawler
A distributed web crawler built with scrapy, redis, mongodb, and graphite: a mongodb cluster provides the underlying storage, redis handles distribution, and graphite displays crawler status.
☆3,243 · Apr 18, 2017 · Updated 9 years ago
getredash / redash
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
☆28,715 · Jul 9, 2026 · Updated 2 weeks ago
cayleygraph / cayley
An open-source graph database
☆15,049 · Updated this week
ariya / phantomjs
Scriptable Headless Browser
☆29,461 · Nov 26, 2022 · Updated 3 years ago
apache / superset
Apache Superset is a Data Visualization and Data Exploration Platform
☆73,960 · Updated this week
BruceDone / awesome-crawler
A collection of awesome web crawler,spider in different languages
☆7,257 · Jun 16, 2024 · Updated 2 years ago
crawlab-team / crawlab
Distributed web crawler admin platform for spiders management regardless of languages and frameworks.
☆12,250 · Feb 10, 2026 · Updated 5 months ago
deanmalmgren / textract
extract text from any document. no muss. no fuss.
☆4,674 · Jul 11, 2026 · Updated 2 weeks ago
aivarsk / scrapy-proxies
Random proxy middleware for Scrapy
☆1,669 · Oct 1, 2019 · Updated 6 years ago
grangier / python-goose
Html Content / Article Extractor, web scrapping lib in Python
☆4,101 · Mar 10, 2026 · Updated 4 months ago
jeanphix / Ghost.py
Webkit based scriptable web browser for python.
☆2,755 · Feb 24, 2024 · Updated 2 years ago
ruipgil / scraperjs
A complete and versatile web scraper.
☆3,716 · Oct 18, 2020 · Updated 5 years ago
apache / predictionio
PredictionIO, a machine learning server for developers and ML engineers.
☆12,522 · Jan 9, 2021 · Updated 5 years ago
holgerd77 / django-dynamic-scraper
Creating Scrapy scrapers via the Django admin interface
☆1,158 · Feb 19, 2022 · Updated 4 years ago
jmcarp / robobrowser
☆3,695 · Sep 10, 2020 · Updated 5 years ago
psf / requests-html
Pythonic HTML Parsing for Humans™
☆13,826 · Apr 16, 2024 · Updated 2 years ago
deis / deis
Deis v1, the CoreOS and Docker PaaS: Your PaaS. Your Rules.
☆5,997 · May 5, 2019 · Updated 7 years ago
scrapy / slybot
☆224 · Apr 27, 2015 · Updated 11 years ago
lorien / awesome-web-scraping
List of libraries, tools and APIs for web scraping and data processing.
☆7,985 · Jul 12, 2026 · Updated last week
sanic-org / sanic
Accelerate your web app development | Build fast. Run fast.
☆18,639 · Updated this week