Remove DIVs, style stuff and normalize HTML preserving structure information
☆14 · Oct 24, 2025 · Updated 5 months ago
Alternatives and similar repositories for clear-html
Users that are interested in clear-html are comparing it to the libraries listed below.
- https://mimesniff.spec.whatwg.org/ implementation for Python ☆13 · Jan 16, 2024 · Updated 2 years ago
- ☆14 · Jan 21, 2026 · Updated 2 months ago
- QMPDClient official repository ☆38 · Nov 18, 2015 · Updated 10 years ago
- Python port of SymSpell ☆17 · Feb 22, 2019 · Updated 7 years ago
- ☆10 · Jun 17, 2017 · Updated 8 years ago
- Zyte API integration for Scrapy ☆40 · Updated this week
- Spider templates for automatic crawlers. ☆34 · Mar 26, 2026 · Updated 2 weeks ago
- Zhouyi model zoo (Maintained at https://github.com/Arm-China/Model_zoo) ☆12 · Dec 30, 2024 · Updated last year
- Migrated to: https://codeberg.org/openculinary/knowledge-graph ☆11 · Aug 21, 2025 · Updated 7 months ago
- A linter for Scrapy projects. ☆21 · Feb 25, 2026 · Updated last month
- Web scraping Page Objects core library ☆105 · Updated this week
- A flutter package for showing quick interactions for any widget ☆14 · Sep 25, 2023 · Updated 2 years ago
- An AI-powered GitHub search tool utilising Generative UI ☆14 · Jul 20, 2024 · Updated last year
- Control your Home Assistant media players from your desktop using MPRIS ☆33 · Aug 23, 2024 · Updated last year
- Turn television drama into storyworld knowledge graphs ☆20 · Apr 19, 2025 · Updated 11 months ago
- lightweight LAMA inference wrapper ☆26 · Sep 28, 2023 · Updated 2 years ago
- A good-looking, simple, easy-to-use ESP8266 firmware that is easy to build on! Star, Fork, and Follow, all three!!! ☆15 · Feb 10, 2019 · Updated 7 years ago
- Automatic unit test generation for Scrapy. ☆57 · Jul 12, 2021 · Updated 4 years ago
- A community visualisation for Google Data Studio in the style of the site speed auditing tool Lighthouse gauges. ☆21 · Feb 5, 2023 · Updated 3 years ago
- Describes a methodology for use with SHACL 1.2, including reifications ☆34 · Mar 2, 2026 · Updated last month
- A list of delightful MINDSTORMS software and resources ☆16 · Mar 10, 2025 · Updated last year
- A pure-Python robots.txt parser with support for modern conventions. ☆85 · Jan 29, 2026 · Updated 2 months ago
- Agent based market simulation ☆15 · Aug 10, 2024 · Updated last year
- Generate standalone HTML from OpenAPI Specification. ☆24 · Jul 13, 2025 · Updated 8 months ago
- Apache Pekko based web crawler that uses Playwright to crawl websites and extract text data and links for further processing. ☆22 · Aug 12, 2025 · Updated 7 months ago
- Git-native cross-forge collaboration: posts, issues, PRs, releases, all in your repo ☆44 · Updated this week
- A Dart & Flutter package for translating numbers and dates into a human readable format. ☆18 · Sep 24, 2025 · Updated 6 months ago
- Official TypeScript/JavaScript SDK for the Supadata API. ☆21 · Feb 23, 2026 · Updated last month
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆15 · Updated this week
- Default Twisted does not ship with a CONNECT-enabled HTTP(s) proxy. This code provides one. ☆51 · Feb 21, 2017 · Updated 9 years ago
- Library to populate items using XPath and CSS with a convenient API ☆48 · Jan 29, 2026 · Updated 2 months ago
- Scrape Airbnb, Booking, Hotels.com from a single JavaScript module. ❗No longer maintained. ☆18 · Apr 18, 2023 · Updated 2 years ago
- Finetuning Whisper ASR model for Belarusian language ☆17 · Feb 16, 2025 · Updated last year
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆24 · Oct 10, 2024 · Updated last year
- The Florence Tool CLI provides a command-line interface for processing images using the Florence-2 model. This tool allows users to apply… ☆16 · Jan 21, 2025 · Updated last year
- An accurate, extensible, and fast HTML-to-markdown converter. ☆23 · Feb 7, 2026 · Updated 2 months ago
- Remove clutter from URLs and return a canonicalized version ☆21 · Jun 3, 2024 · Updated last year
- Simple program to get A LOT OF invites to https://foobar.withgoogle.com/ ☆31 · Jun 12, 2019 · Updated 6 years ago
- Abstraction for communicating with REST API in flutter projects. ☆12 · Mar 13, 2026 · Updated 3 weeks ago