scrapy / parsel
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
☆1,149 · Updated last month
Related projects
Alternatives and complementary repositories for parsel
- Integration layer between Requests and Selenium for automation of web actions. ☆1,834 · Updated 8 months ago
- A toolbelt of useful classes and functions to be used with python-requests ☆999 · Updated 7 months ago
- Command line client for Scrapyd server ☆770 · Updated last month
- Python library of web-related functions ☆393 · Updated last month
- Run JavaScript code from Python (EOL: https://gist.github.com/doloopwhile/8c6ec7dd4703e8a44e559411cb2ea221) ☆708 · Updated 4 years ago
- Async requests-like httplib for python. ☆508 · Updated 2 years ago
- Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors). ☆1,162 · Updated last week
- A jquery-like library for python ☆2,300 · Updated 2 months ago
- HTTP API for Scrapy spiders ☆836 · Updated 4 months ago
- Retrying is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to … ☆1,917 · Updated 3 years ago
- 🎭 Playwright integration for Scrapy ☆1,031 · Updated last week
- Extends Selenium WebDriver classes to include the request function from the Requests library, while doing all the needed cookie and reque… ☆494 · Updated 8 months ago
- 🌐 URL parsing and manipulation made easy. ☆2,645 · Updated this week
- Asynchronous Python HTTP Requests for Humans using Futures ☆2,108 · Updated last week
- Fast HTTP parser ☆1,205 · Updated last month
- 🎭 Twisted Deferred Thread backend for Requests. ☆417 · Updated 5 years ago
- Scrapy Extension for monitoring spiders execution. ☆533 · Updated last week
- Requests + Gevent = <3 ☆4,490 · Updated 3 months ago
- Subprocesses for Humans 2.0. ☆1,701 · Updated last year
- A service daemon to run Scrapy spiders ☆2,971 · Updated last week
- aiomysql is a library for accessing a MySQL database from the asyncio ☆1,765 · Updated 3 weeks ago
- Random User-Agent middleware based on fake-useragent ☆688 · Updated last year
- Web Scraping Framework ☆2,394 · Updated 8 months ago
- Lightweight Python utilities for working with Redis ☆1,155 · Updated last month
- A fast Python in-process signal/event dispatching system. ☆1,798 · Updated 2 weeks ago
- A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html ☆833 · Updated 3 months ago
- File support for asyncio ☆2,864 · Updated last week
- Yet another URL library ☆1,336 · Updated this week
- Scrapy spider middleware to ignore requests to pages containing items seen in previous crawls ☆268 · Updated 3 years ago
- Useful extensions to the standard Python datetime features ☆2,372 · Updated 3 months ago