Alir3z4 / html2text
Convert HTML to Markdown-formatted text.
☆1,962 Updated this week
Alternatives and similar repositories for html2text:
Users who are interested in html2text are comparing it to the libraries listed below.
- fast python port of arc90's readability tool, updated to match latest readability.js! ☆2,759 Updated 3 months ago
- Convert HTML to Markdown-formatted text. ☆2,709 Updated last year
- Useful extensions to the standard Python datetime features ☆2,442 Updated 2 weeks ago
- extract text from any document. no muss. no fuss. ☆4,072 Updated 4 months ago
- Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors ☆1,216 Updated 2 weeks ago
- Returns unicode slugs ☆1,526 Updated last year
- Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c. ☆4,686 Updated last week
- Wkhtmltopdf python wrapper to convert html to pdf ☆2,016 Updated last year
- Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes ☆2,684 Updated last week
- A fast yet powerful Python Markdown parser with renderers and plugins. ☆2,752 Updated 2 weeks ago
- Convert HTML to Markdown ☆1,553 Updated 3 weeks ago
- Safely pass trusted data to untrusted environments and back. ☆2,995 Updated 3 months ago
- A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files ☆8,952 Updated this week
- A generator library for concise, unambiguous and URL-safe UUIDs. ☆2,120 Updated 4 months ago
- A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html ☆862 Updated 3 months ago
- Retrying is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to … ☆1,922 Updated 3 years ago
- Datetimes for Humans™ ☆3,416 Updated 9 months ago
- Fixes mojibake and other glitches in Unicode text, after the fact. ☆3,898 Updated 5 months ago
- Python character encoding detector ☆2,243 Updated 3 months ago
- python parser for human readable dates ☆2,639 Updated 3 weeks ago
- The lxml XML toolkit for Python ☆2,808 Updated this week
- Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python. ☆2,518 Updated 8 months ago
- Parse feeds in Python ☆2,092 Updated last week
- 🌐 URL parsing and manipulation made easy. ☆2,676 Updated 3 weeks ago
- Accurately separates a URL’s subdomain, domain, and public suffix, using the Public Suffix List (PSL). ☆1,887 Updated last week
- A jquery-like library for python ☆2,349 Updated 7 months ago
- Standards-compliant library for parsing and serializing HTML documents and fragments in Python ☆1,189 Updated last year
- Python Classes Without Boilerplate ☆5,454 Updated last week
- Automatically mock your HTTP interactions to simplify and speed up testing ☆2,783 Updated this week
- Requests + Gevent = <3 ☆4,555 Updated 8 months ago