scrapy/w3lib

Python library of web-related functions

☆419

Alternatives and similar repositories for w3lib

Users who are interested in w3lib are comparing it to the libraries listed below.

scrapy / itemloaders
Library to populate items using XPath and CSS with a convenient API
☆49 · Updated this week
scrapinghub / js2xml
Convert Javascript code to an XML document
☆188 · Mar 14, 2022 · Updated 4 years ago
scrapy / parsel
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
☆1,343 · Updated this week
scrapy / scrapely
A pure-python HTML screen-scraping library
☆1,884 · Apr 4, 2022 · Updated 4 years ago
scrapy / protego
A pure-Python robots.txt parser with support for modern conventions.
☆90 · Updated this week
scrapy / queuelib
Collection of persistent (disk-based) and non-persistent (memory-based) queues for Python
☆299 · Jun 26, 2026 · Updated 3 weeks ago
scrapy / cssselect
CSS Selectors for Python
☆309 · Updated this week
scrapinghub / scrapyrt
HTTP API for Scrapy spiders
☆882 · Jun 29, 2026 · Updated 3 weeks ago
scrapy / xtractmime
https://mimesniff.spec.whatwg.org/ implementation for Python
☆13 · Jul 9, 2026 · Updated last week
scrapinghub / extruct
Extract embedded metadata from HTML markup
☆966 · Apr 1, 2026 · Updated 3 months ago
redapple / parslepy
Python implementation of the Parsley language for extracting structured data from web pages
☆92 · Oct 26, 2017 · Updated 8 years ago
scrapinghub / scmongo
MongoDB extensions for Scrapy
☆44 · Oct 2, 2014 · Updated 11 years ago
scrapy / loginform
Fill HTML login forms automatically
☆279 · Apr 24, 2024 · Updated 2 years ago
scrapinghub / splash
Lightweight, scriptable browser as a service with an HTTP API
☆4,190 · Aug 2, 2024 · Updated last year
scrapinghub / python-scrapinghub
A client interface for Scrapinghub's API
☆206 · Jul 14, 2026 · Updated last week
scrapinghub / skinfer
Skinfer is a tool for inferring and merging JSON schemas
☆141 · Apr 24, 2024 · Updated 2 years ago
scrapinghub / dateparser
python parser for human readable dates
☆2,843 · Updated this week
scrapinghub / web-poet
Web scraping Page Objects core library
☆107 · Jul 10, 2026 · Updated last week
scrapy / scurl
Performance-focused replacement for Python urllib
☆21 · Apr 13, 2026 · Updated 3 months ago
scrapinghub / scrapy-poet
Page Object pattern for Scrapy
☆127 · Jun 8, 2026 · Updated last month
scrapinghub / spidermon
Scrapy Extension for monitoring spiders execution.
☆561 · May 28, 2026 · Updated last month
zytedata / zyte-autoextract
Python clients for Zyte AutoExtract API
☆41 · Jan 17, 2022 · Updated 4 years ago
scrapinghub / shub
Scrapinghub Command Line Client
☆129 · Updated this week
scrapy / scrapyd
A service daemon to run Scrapy spiders
☆3,097 · Updated this week
scrapy / scrapyd-client
Command line client for Scrapyd server
☆772 · Feb 27, 2026 · Updated 4 months ago
scrapy-plugins / scrapy-splash
Scrapy+Splash for JavaScript integration
☆3,229 · Feb 11, 2025 · Updated last year
rmax / scrapy-boilerplate
Small set of utilities to simplify writing Scrapy spiders.
☆50 · Jul 24, 2015 · Updated 10 years ago
TeamHG-Memex / soft404
A classifier for detecting soft 404 pages
☆65 · Apr 8, 2026 · Updated 3 months ago
TeamHG-Memex / Formasaurus
Formasaurus tells you the type of an HTML form and its fields using machine learning
☆121 · Apr 8, 2026 · Updated 3 months ago
scrapy-plugins / scrapy-querycleaner
Scrapy spider middleware to clean up query parameters in request URLs
☆24 · Jun 30, 2016 · Updated 10 years ago
TeamHG-Memex / autopager
Detect and classify pagination links
☆107 · Apr 8, 2026 · Updated 3 months ago
scrapinghub / webstruct
NER toolkit for HTML data
☆259 · May 3, 2024 · Updated 2 years ago
llonchj / scrapy-sentry
Sentry component for Scrapy
☆84 · Aug 21, 2023 · Updated 2 years ago
scrapinghub / scrapinghub-entrypoint-scrapy
Scrapy entrypoint for Scrapinghub job runner
☆24 · Feb 26, 2026 · Updated 4 months ago
scrapinghub / scrapylib
Collection of Scrapy utilities (extensions, middlewares, pipelines, etc)
☆33 · Feb 22, 2018 · Updated 8 years ago
scrapy / scrapy-lint
A linter for Scrapy projects.
☆22 · Jul 7, 2026 · Updated 2 weeks ago
TeamHG-Memex / url-summary
Show summary of a large number of URLs in a Jupyter Notebook
☆19 · Apr 8, 2026 · Updated 3 months ago
scrapinghub / arche
Analyze scraped data
☆47 · Dec 9, 2019 · Updated 6 years ago
scrapinghub / number-parser
Parse numbers written in natural language
☆130 · Oct 23, 2024 · Updated last year