Scrapy spider middleware to clean up query parameters in request URLs
☆24 · Jun 30, 2016 · Updated 9 years ago
Alternatives and similar repositories for scrapy-querycleaner
Users that are interested in scrapy-querycleaner are comparing it to the libraries listed below.
- Scrapy spider middleware to split an item into multiple items using a multi-valued key ☆21 · Feb 8, 2017 · Updated 9 years ago
- Scrapy middleware to add extra fields to items, like timestamp, response fields, spider attributes, etc. ☆57 · Mar 16, 2022 · Updated 4 years ago
- A Scrapy extension to store request and response information in a storage service ☆27 · Mar 11, 2022 · Updated 4 years ago
- An efficient simhash implementation for Python ☆128 · Oct 25, 2019 · Updated 6 years ago
- Scrapy spider middleware to ignore requests to pages containing items seen in previous crawls ☆276 · Feb 26, 2025 · Updated last year
- A Scrapy extension to sync the `.scrapy` folder to an S3 bucket ☆18 · Mar 28, 2022 · Updated 3 years ago
- A simple algorithm for clustering web pages, suitable for crawlers ☆35 · Mar 6, 2017 · Updated 9 years ago
- ☆19 · Oct 12, 2016 · Updated 9 years ago
- ☆29 · Apr 28, 2021 · Updated 4 years ago
- Python clients for Zyte AutoExtract API ☆41 · Jan 17, 2022 · Updated 4 years ago
- Feature switches in Django ☆37 · Jun 21, 2021 · Updated 4 years ago
- High Level Kafka Scanner ☆19 · Sep 29, 2017 · Updated 8 years ago
- A project to attempt to automatically log in to a website given a single seed ☆11 · Jun 17, 2024 · Updated last year
- ☆10 · Nov 18, 2021 · Updated 4 years ago
- Paginating the web ☆37 · Feb 11, 2014 · Updated 12 years ago
- CLI to take the toil out of software development ☆16 · Jan 7, 2025 · Updated last year
- A starter project that supports social authentication ☆17 · Mar 15, 2016 · Updated 10 years ago
- Demo of orchestrating Airbyte connections with Prefect ☆11 · Mar 3, 2022 · Updated 4 years ago
- MongoDB Manager for Django: providing native Django ORM support for MongoDB. ☆30 · Dec 26, 2022 · Updated 3 years ago
- Contains the class `ctx.App` that exposes the Spring context statically ☆14 · Jun 4, 2020 · Updated 5 years ago
- Easily and efficiently extract deeply nested data in Rust ☆32 · Updated this week
- DjangoCMS Comments Module ☆11 · Dec 26, 2022 · Updated 3 years ago
- ☆21 · May 2, 2023 · Updated 2 years ago
- Small FastCDC implementation in C99 ☆17 · Dec 31, 2022 · Updated 3 years ago
- A yeoman-based template to generate a great documentation website ☆11 · Feb 3, 2023 · Updated 3 years ago
- Tutorial on how to create a Twitter bot that replies to mentions ☆10 · Sep 16, 2023 · Updated 2 years ago
- MongoDB extensions for Scrapy ☆44 · Oct 2, 2014 · Updated 11 years ago
- A simple Django app for logging JavaScript client-side errors ☆23 · Oct 17, 2022 · Updated 3 years ago
- Exporters is an extensible export pipeline library that supports filtering, transformation, and several sources and destinations ☆40 · May 21, 2024 · Updated last year
- Lets you transfer files and directories from your computer to your mobile device by scanning a QR code right from the terminal. ☆13 · Dec 11, 2023 · Updated 2 years ago
- This sample deploys the LiteralAI platform on Azure in a few minutes. Literal AI is an observability and evaluation platform for… ☆13 · Jul 11, 2024 · Updated last year
- Parsing JavaScript objects into Python data structures ☆218 · Aug 4, 2025 · Updated 7 months ago
- Generate deterministic color from any object ☆16 · Mar 12, 2026 · Updated last week
- Music/Audio player built in HTML5 that can play local files ☆16 · Jun 23, 2012 · Updated 13 years ago
- Scrapy schema validation pipeline and Item builder using JSON Schema ☆45 · Mar 26, 2021 · Updated 4 years ago
- Automatic Route53 updates based on EC2 Autoscaling state changes ☆10 · Dec 10, 2017 · Updated 8 years ago
- ☆50 · Apr 4, 2022 · Updated 3 years ago
- CLI for evaluating CSS and XPath selectors ☆29 · Jul 4, 2023 · Updated 2 years ago
- Gatsby source plugin for consuming data from Google Sheets ☆19 · Jan 3, 2023 · Updated 3 years ago