Extract the difference between two HTML pages
☆33 · Apr 8, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for extract-html-diff
Users who are interested in extract-html-diff are comparing it to the libraries listed below.
- Site Hound (previously THH) is a Domain Discovery Tool ☆24 · Apr 8, 2026 · Updated 3 weeks ago
- A component that tries to avoid downloading duplicate content ☆28 · Apr 8, 2026 · Updated 3 weeks ago
- A classifier for detecting soft 404 pages ☆61 · Apr 8, 2026 · Updated 3 weeks ago
- A Tor SOCKS proxy Docker image ☆12 · Apr 8, 2026 · Updated 3 weeks ago
- Simple heuristic for measuring web page similarity (& data set) ☆91 · Apr 8, 2026 · Updated 3 weeks ago
- A generic crawler ☆79 · Apr 8, 2026 · Updated 3 weeks ago
- Extract text from HTML ☆135 · Apr 8, 2026 · Updated 3 weeks ago
- Narwhal is a keyword and KEY NARRATIVE manager that creates language-aware classes. Because Narwhal does not use NLP it avoids complexity… ☆12 · Oct 16, 2018 · Updated 7 years ago
- Paginating the web ☆37 · Feb 11, 2014 · Updated 12 years ago
- Python implementation of the Parsley language for extracting structured data from web pages ☆92 · Oct 26, 2017 · Updated 8 years ago
- Detect and classify pagination links ☆107 · Apr 8, 2026 · Updated 3 weeks ago
- An efficient approximation for tree edit-distance. ☆45 · Sep 6, 2011 · Updated 14 years ago
- Adaptive crawler which uses Reinforcement Learning methods ☆169 · Apr 8, 2026 · Updated 3 weeks ago
- [UNMAINTAINED] Deploy, run and monitor your Scrapy spiders. ☆12 · Apr 8, 2026 · Updated 3 weeks ago
- Analyze standard numbers like ARK, DOI, EAN, GTIN, IBAN, ISAN, ISBN, ISMN, ISNI, ISSN, ISTC, ISWC, ORCID, PPN, SICI, UPC, ZDB with Elasti… ☆24 · Jul 5, 2016 · Updated 9 years ago
- Scrapy middleware for the autologin ☆36 · Apr 8, 2026 · Updated 3 weeks ago
- Automatic Item List Extraction ☆86 · Jun 15, 2016 · Updated 9 years ago
- Library for annotation-based dependency injection ☆24 · Mar 3, 2026 · Updated last month
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆193 · Apr 29, 2022 · Updated 4 years ago
- Given a new image, determine if it is likely derived from a known image. ☆21 · Apr 8, 2026 · Updated 3 weeks ago
- Intelligent Web Data Extractor ☆74 · Dec 5, 2022 · Updated 3 years ago
- HTML5 audio/video clipper ☆13 · Mar 7, 2018 · Updated 8 years ago
- ☆11 · May 31, 2019 · Updated 6 years ago
- Extraction Toolkit ☆83 · Nov 18, 2021 · Updated 4 years ago
- Common interface for data container classes ☆69 · Mar 16, 2026 · Updated last month
- Detect and classify pagination links ☆15 · Sep 9, 2020 · Updated 5 years ago
- ☆10 · Apr 22, 2024 · Updated 2 years ago
- Make a Rust executable that runs on AWS Lambda ☆10 · Mar 2, 2021 · Updated 5 years ago
- Spider templates for automatic crawlers. ☆34 · Mar 26, 2026 · Updated last month
- Python WSGI Middleware for adding HTTP/S proxy support to any WSGI Application ☆24 · Oct 27, 2020 · Updated 5 years ago
- Extensions for using Scrapy on Amazon AWS ☆32 · Dec 5, 2012 · Updated 13 years ago
- Price and currency parsing utility ☆27 · Mar 6, 2023 · Updated 3 years ago
- Web Crawling UI and HTTP API, based on Scrapy and Tornado ☆160 · Apr 8, 2026 · Updated 3 weeks ago
- Extract embedded metadata from HTML markup ☆962 · Apr 1, 2026 · Updated 3 weeks ago
- Formasaurus tells you the type of an HTML form and its fields using machine learning ☆121 · Apr 8, 2026 · Updated 3 weeks ago
- A project to attempt to automatically login to a website given a single seed ☆129 · Apr 8, 2026 · Updated 3 weeks ago
- ☆20 · Oct 2, 2024 · Updated last year
- Modules for the Stratos ERP project ☆13 · May 15, 2023 · Updated 2 years ago
- Podclips is an iOS app that allows users to cut out and share clips from their favourite podcasts ☆15 · Mar 25, 2018 · Updated 8 years ago