Convert JavaScript code to an XML document
☆187 · Mar 14, 2022 · Updated 3 years ago
Alternatives and similar repositories for js2xml
Users who are interested in js2xml are comparing it to the libraries listed below
- Library to populate items using XPath and CSS with a convenient API ☆47 · Jan 29, 2026 · Updated last month
- Python library of web-related functions ☆414 · Feb 19, 2026 · Updated last week
- Parsing JavaScript objects into Python data structures ☆217 · Aug 4, 2025 · Updated 6 months ago
- Show summary of a large number of URLs in a Jupyter Notebook ☆17 · Feb 10, 2026 · Updated 3 weeks ago
- Extract embedded metadata from HTML markup ☆951 · Oct 1, 2025 · Updated 5 months ago
- Web scraping Page Objects core library ☆104 · Jan 27, 2026 · Updated last month
- Page Object pattern for Scrapy ☆127 · Jan 28, 2026 · Updated last month
- A generic crawler ☆78 · Feb 10, 2026 · Updated 3 weeks ago
- A linter for Scrapy projects. ☆21 · Updated this week
- Detect and classify pagination links ☆105 · Feb 10, 2026 · Updated 3 weeks ago
- A browser extension to monitor your spiders deployed on Scrapy Cloud. ☆16 · Mar 8, 2025 · Updated 11 months ago
- MongoDB extensions for Scrapy ☆44 · Oct 2, 2014 · Updated 11 years ago
- Restrict crawl and scraping scope using matchers. ☆26 · Jun 8, 2016 · Updated 9 years ago
- High Level Kafka Scanner ☆19 · Sep 29, 2017 · Updated 8 years ago
- A project to attempt to automatically log in to a website given a single seed ☆11 · Jun 17, 2024 · Updated last year
- Find which links on a web page are pagination links ☆29 · Jan 12, 2017 · Updated 9 years ago
- Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors ☆1,315 · Jan 29, 2026 · Updated last month
- Scrapy middleware to add extra fields to items, like timestamp, response fields, spider attributes, etc. ☆57 · Mar 16, 2022 · Updated 3 years ago
- NER toolkit for HTML data ☆259 · May 3, 2024 · Updated last year
- Extract price amount and currency symbol from a raw text string ☆347 · Feb 12, 2026 · Updated 2 weeks ago
- Skinfer is a tool for inferring and merging JSON schemas ☆141 · Apr 24, 2024 · Updated last year
- Crochet-based blocking API for Scrapy. ☆46 · Feb 24, 2017 · Updated 9 years ago
- Scrapy spider middleware to split an item into multiple items using a multi-valued key ☆21 · Feb 8, 2017 · Updated 9 years ago
- Exporters is an extensible export pipeline library that supports filter, transform and several sources and destinations ☆40 · May 21, 2024 · Updated last year
- Scrapy spider middleware to ignore requests to pages containing items seen in previous crawls ☆277 · Feb 26, 2025 · Updated last year
- Sentry component for Scrapy ☆86 · Aug 21, 2023 · Updated 2 years ago
- Pluggable DSL that uses pipes to perform a series of linear transformations to extract data ☆16 · Jul 9, 2024 · Updated last year
- Extensions for using Scrapy on Amazon AWS ☆32 · Dec 5, 2012 · Updated 13 years ago
- Scrapy extension for monitoring spider execution. ☆553 · Updated this week
- HTTP API for Scrapy spiders ☆879 · Feb 16, 2026 · Updated 2 weeks ago
- A Scrapy extension to store request and response information in a storage service ☆27 · Mar 11, 2022 · Updated 3 years ago
- Intelligent Web Data Extractor ☆74 · Dec 5, 2022 · Updated 3 years ago
- Python parser for human-readable dates ☆2,788 · Updated this week
- Scrapy spider middleware to clean up query parameters in request URLs ☆24 · Jun 30, 2016 · Updated 9 years ago
- A client interface for Scrapinghub's API ☆205 · Oct 3, 2025 · Updated 5 months ago
- A Scrapy extension to log item coverage when the spider shuts down ☆19 · Apr 11, 2020 · Updated 5 years ago
- Detect and classify pagination links ☆15 · Sep 9, 2020 · Updated 5 years ago
- A Python library to detect and extract listing data from HTML pages. ☆108 · May 5, 2017 · Updated 8 years ago
- The missing datasets manager. Like Homebrew but for datasets. CLI tool to search and discover datasets! ☆41 · May 29, 2017 · Updated 8 years ago