Remove DIVs, style stuff and normalize HTML preserving structure information
☆14 · Oct 24, 2025 · Updated 6 months ago
Alternatives and similar repositories for clear-html
Users who are interested in clear-html are comparing it to the libraries listed below.
- https://mimesniff.spec.whatwg.org/ implementation for Python ☆13 · Jan 16, 2024 · Updated 2 years ago
- ☆14 · Apr 22, 2026 · Updated last week
- QMPDClient official repository ☆38 · Nov 18, 2015 · Updated 10 years ago
- Python port of SymSpell ☆17 · Feb 22, 2019 · Updated 7 years ago
- ☆10 · Jun 17, 2017 · Updated 8 years ago
- Zyte API integration for Scrapy ☆40 · Apr 23, 2026 · Updated last week
- Spider templates for automatic crawlers. ☆34 · Mar 26, 2026 · Updated last month
- Zhouyi model zoo (Maintained at https://github.com/Arm-China/Model_zoo) ☆12 · Dec 30, 2024 · Updated last year
- Migrated to: https://codeberg.org/openculinary/knowledge-graph ☆11 · Aug 21, 2025 · Updated 8 months ago
- A linter for Scrapy projects. ☆22 · Feb 25, 2026 · Updated 2 months ago
- Web scraping Page Objects core library ☆105 · Apr 21, 2026 · Updated last week
- A flutter package for showing quick interactions for any widget ☆14 · Sep 25, 2023 · Updated 2 years ago
- An AI-powered GitHub search tool utilising Generative UI ☆14 · Jul 20, 2024 · Updated last year
- Control your Home Assistant media players from your desktop using MPRIS ☆33 · Aug 23, 2024 · Updated last year
- Turn television drama into storyworld knowledge graphs ☆27 · Apr 19, 2025 · Updated last year
- lightweight LAMA inference wrapper ☆27 · Sep 28, 2023 · Updated 2 years ago
- An attractive, simple, easy-to-use ESP8266 firmware that is easy to build on! Star, Fork, and Follow, all three!!! ☆15 · Feb 10, 2019 · Updated 7 years ago
- Automatic unit test generation for Scrapy. ☆57 · Jul 12, 2021 · Updated 4 years ago
- A community visualisation for Google Data Studio in the style of the site speed auditing tool Lighthouse gauges. ☆21 · Feb 5, 2023 · Updated 3 years ago
- Describes a methodology for use with SHACL 1.2, including reifications ☆35 · Mar 2, 2026 · Updated last month
- A list of delightful MINDSTORMS software and resources ☆14 · Mar 10, 2025 · Updated last year
- Agent based market simulation ☆15 · Aug 10, 2024 · Updated last year
- A pure-Python robots.txt parser with support for modern conventions. ☆86 · Jan 29, 2026 · Updated 3 months ago
- Generate standalone HTML from OpenAPI Specification. ☆27 · Jul 13, 2025 · Updated 9 months ago
- Apache Pekko based web crawler that uses Playwright to crawl websites and extract text data and links for further processing. ☆22 · Aug 12, 2025 · Updated 8 months ago
- Official TypeScript/JavaScript SDK for the Supadata API. ☆22 · Feb 23, 2026 · Updated 2 months ago
- A Dart & Flutter package for translating numbers and dates into a human readable format. ☆18 · Sep 24, 2025 · Updated 7 months ago
- Library to populate items using XPath and CSS with a convenient API ☆48 · Jan 29, 2026 · Updated 3 months ago
- Default Twisted does not ship with a CONNECT-enabled HTTP(s) proxy. This code provides one. ☆51 · Feb 21, 2017 · Updated 9 years ago
- Scrape Airbnb, Booking, Hotels.com from a single JavaScript module. ❗No longer maintained. ☆18 · Apr 18, 2023 · Updated 3 years ago
- Finetuning Whisper ASR model for Belarusian language ☆17 · Feb 16, 2025 · Updated last year
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆24 · Oct 10, 2024 · Updated last year
- The Florence Tool CLI provides a command-line interface for processing images using the Florence-2 model. This tool allows users to apply… ☆16 · Jan 21, 2025 · Updated last year
- An accurate, extensible, and fast HTML-to-markdown converter. ☆23 · Feb 7, 2026 · Updated 2 months ago
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆16 · Apr 14, 2026 · Updated 2 weeks ago
- Remove clutter from URLs and return a canonicalized version ☆21 · Jun 3, 2024 · Updated last year
- LAiSER is a tool that helps learners, educators and employers share trusted and mutually intelligible information about skills. ☆10 · Mar 31, 2026 · Updated last month
- Simple program to get A LOT OF invites to https://foobar.withgoogle.com/ ☆31 · Jun 12, 2019 · Updated 6 years ago
- Abstraction for communicating with REST API in flutter projects. ☆12 · Mar 13, 2026 · Updated last month