Scrapy middleware that adds extra fields to items, such as a timestamp, response fields, and spider attributes.
☆57 · Mar 16, 2022 · Updated 4 years ago
Alternatives and similar repositories for scrapy-magicfields
Users that are interested in scrapy-magicfields are comparing it to the libraries listed below.
- Scrapy spider middleware to split an item into multiple items using a multi-valued key ☆21 · Feb 8, 2017 · Updated 9 years ago
- Scrapy spider middleware to clean up query parameters in request URLs ☆24 · Jun 30, 2016 · Updated 9 years ago
- A Scrapy extension to log items coverage when the spider shuts down ☆19 · Apr 11, 2020 · Updated 5 years ago
- Scrapy schema validation pipeline and Item builder using JSON Schema ☆45 · Mar 26, 2021 · Updated 4 years ago
- Scrapy Extension for monitoring spiders execution. ☆552 · Mar 5, 2026 · Updated 2 weeks ago
- Web scraping Page Objects core library ☆104 · Mar 10, 2026 · Updated last week
- A CLI for benchmarking Scrapy. ☆32 · Jun 28, 2025 · Updated 8 months ago
- Page Object pattern for Scrapy ☆127 · Mar 15, 2026 · Updated last week
- Exporters is an extensible export pipeline library that supports filter, transform and several sources and destinations ☆40 · May 21, 2024 · Updated last year
- Library designed to replace the SQLite backend by a MongoDB backend on Scrapy queue management ☆17 · Sep 2, 2017 · Updated 8 years ago
- A collection of pipelines for Scrapy ☆16 · Mar 13, 2026 · Updated last week
- ☆19 · Oct 12, 2016 · Updated 9 years ago
- Small set of utilities to simplify writing Scrapy spiders. ☆50 · Jul 24, 2015 · Updated 10 years ago
- Sentry component for Scrapy ☆86 · Aug 21, 2023 · Updated 2 years ago
- Analyze scraped data ☆46 · Dec 9, 2019 · Updated 6 years ago
- Web grep: search all rendered resources used by a URI ☆89 · Nov 21, 2025 · Updated 4 months ago
- In this repository, I try to share some of the little tips and tricks and amazing spiders that I used to work with on the scrapy framewor… ☆12 · Feb 2, 2020 · Updated 6 years ago
- Scrapy extension to control spiders using JSON-RPC ☆299 · Aug 26, 2019 · Updated 6 years ago
- Scrapy middleware that allows crawling only new content ☆80 · Feb 10, 2026 · Updated last month
- A high-performance, lightweight, and human-friendly serving engine for Scrapy ☆29 · Mar 17, 2025 · Updated last year
- A scrapy spider for R18 ☆16 · Mar 13, 2026 · Updated last week
- MongoDB extensions for Scrapy ☆44 · Oct 2, 2014 · Updated 11 years ago
- Useful test spiders for Scrapy ☆184 · Jan 20, 2020 · Updated 6 years ago
- A declarative data-migration package ☆16 · Dec 7, 2024 · Updated last year
- Collection of plugins for https://github.com/dongweiming/wechat-admin ☆14 · Jul 24, 2017 · Updated 8 years ago
- A Django Debug Toolbar panel for Haystack ☆40 · Feb 25, 2014 · Updated 12 years ago
- Extract embedded metadata from HTML markup ☆956 · Oct 1, 2025 · Updated 5 months ago
- Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy ☆366 · Mar 24, 2025 · Updated 11 months ago
- Collection of persistent (disk-based) and non-persistent (memory-based) queues for Python ☆295 · Jan 29, 2026 · Updated last month
- Pre-built Scrapy spiders for AutoExtract ☆19 · Apr 24, 2024 · Updated last year
- Scrapy extension to write items using sqlalchemy models ☆37 · Apr 24, 2017 · Updated 8 years ago
- JAV site scrapers ☆18 · Jul 6, 2022 · Updated 3 years ago
- A decorator to write coroutine-like spider callbacks. ☆109 · Dec 26, 2022 · Updated 3 years ago
- Parsing JavaScript objects into Python data structures ☆218 · Aug 4, 2025 · Updated 7 months ago
- Scrapy+Splash for JavaScript integration ☆3,233 · Feb 11, 2025 · Updated last year
- A daemon for scheduling Scrapy spiders ☆66 · May 28, 2021 · Updated 4 years ago
- MongoDB pipeline for Scrapy. This module supports both MongoDB in standalone setups and replica sets. scrapy-mongodb will insert the item… ☆358 · Apr 6, 2021 · Updated 4 years ago
- an instant to crawl JD data ☆12 · May 31, 2017 · Updated 8 years ago
- Smallest pdf2htmlEX container and easiest way to convert pdf to html file (246MB) ☆18 · Oct 15, 2015 · Updated 10 years ago