scrapinghub/scrapy-autoextract

Zyte Automatic Extraction integration for Scrapy

☆58

Alternatives and similar repositories for scrapy-autoextract

Users that are interested in scrapy-autoextract are comparing it to the libraries listed below.

zytedata / zyte-autoextract
Python clients for Zyte AutoExtract API
☆41 · Jan 17, 2022 · Updated 4 years ago
scrapy / itemloaders
Library to populate items using XPath and CSS with a convenient API
☆49 · Updated this week
scrapinghub / shublang
Pluggable DSL that uses pipes to perform a series of linear transformations to extract data
☆16 · Jul 9, 2024 · Updated 2 years ago
ejulio / spider-feeder
A library to make it easier to load input URLs to start scrapy processes
☆14 · Feb 21, 2021 · Updated 5 years ago
scrapinghub / autoextract-spiders
Pre-built Scrapy spiders for AutoExtract
☆19 · Apr 24, 2024 · Updated 2 years ago
scrapinghub / arche
Analyze scraped data
☆47 · Dec 9, 2019 · Updated 6 years ago
scrapinghub / spidermon
Scrapy Extension for monitoring spiders execution.
☆562 · May 28, 2026 · Updated 2 months ago
joe-wojniak / PythonForFinance
Examples inspired by book Python For Finance
☆12 · Jan 20, 2021 · Updated 5 years ago
scrapinghub / shub-workflow
☆14 · Updated this week
zytedata / zyte-smartproxy-headless-proxy
A complementary proxy to help use SPM with headless browsers
☆109 · May 20, 2026 · Updated 2 months ago
scrapinghub / web-poet
Web scraping Page Objects core library
☆107 · Jul 10, 2026 · Updated 2 weeks ago
scrapinghub / scrapyrt
HTTP API for Scrapy spiders
☆882 · Jun 29, 2026 · Updated last month
sajib1066 / django-boilerplate
Boilerplate for any django projects with HTML, CSS, Bootstrap.
☆13 · Feb 13, 2026 · Updated 5 months ago
BurnzZ / uberfare
Conveniently collects Uber fare prices from a given origin and destination.
☆12 · Jul 16, 2023 · Updated 3 years ago
litestar-org / pydantic-openapi-schema
Generate OpenAPI 3.x.x using Pydantic
☆11 · Feb 9, 2023 · Updated 3 years ago
scrapinghub / shub
Scrapinghub Command Line Client
☆129 · Jul 22, 2026 · Updated last week
croqaz / awesome-scrapy
🕶 Awesome list of Scrapy tools and libraries
☆60 · Jul 6, 2020 · Updated 6 years ago
victor-torres / sinesp-bot
Automating vehicle lookups using the SINESP database
☆29 · Jul 6, 2019 · Updated 7 years ago
composable-logs / composable-logs
Python library to run ML/data pipelines on stateless compute infrastructure (that may be ephemeral or serverless). Please see the documen…
☆18 · May 23, 2023 · Updated 3 years ago
rmax / scrapy-inline-requests
A decorator to write coroutine-like spider callbacks.
☆109 · Dec 26, 2022 · Updated 3 years ago
TeamHG-Memex / html-text
Extract text from HTML
☆135 · Apr 8, 2026 · Updated 3 months ago
Python3WebSpider / ScrapyPyppeteer
Scrapy Pyppeteer Demo
☆12 · Jul 30, 2020 · Updated 5 years ago
scrapinghub / frontera
A scalable frontier for web crawlers
☆1,332 · Jun 6, 2025 · Updated last year
akshaymiterani / QuoraScraper
Web crawler for Quora using Python Selenium
☆11 · Jul 19, 2018 · Updated 8 years ago
gurteshwar / freeswitch-esl-python
Auto-generated SWIG Python module with a binary component
☆11 · Apr 19, 2012 · Updated 14 years ago
IDSIA / novel2graph
☆14 · Mar 30, 2023 · Updated 3 years ago
luizdepra / r8
A simple CHIP8 interpreter made with Rust.
☆11 · Apr 23, 2026 · Updated 3 months ago
further-reading / scrapy-gui
A simple, Qt-Webengine powered web browser with built in functionality for basic scrapy webscraping support.
☆109 · May 21, 2024 · Updated 2 years ago
histograph / api
Histograph API
☆13 · Aug 24, 2020 · Updated 5 years ago
scrapy-plugins / scrapy-zyte-smartproxy
Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy
☆363 · May 4, 2026 · Updated 2 months ago
kiwicom / pg2avro
Utility generating avro files from postgres
☆17 · Jul 9, 2024 · Updated 2 years ago
Granitosaurus / playwright-stealth
☆148 · Nov 6, 2023 · Updated 2 years ago
scrapinghub / product-extraction-benchmark
☆16 · Apr 10, 2026 · Updated 3 months ago
ad2476 / pos-research
Semi-supervised POS tagger for Sanskrit
☆10 · Aug 22, 2016 · Updated 9 years ago
scrapy / parsel
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
☆1,347 · Updated this week
dev0x13 / pywuffs
Python bindings for Wuffs the Library
☆19 · Apr 5, 2025 · Updated last year
jondot / formation
A generic functional middleware infrastructure for Python.
☆17 · Jan 26, 2023 · Updated 3 years ago
sanskrit / vyakarana
A Paninian simulator
☆20 · Jun 2, 2014 · Updated 12 years ago
liaocyintl / web-segment
Segment an HTML document into structural data
☆12 · Jan 15, 2019 · Updated 7 years ago