Alir3z4/html2text

Convert HTML to Markdown-formatted text.

☆2,169

Alternatives and similar repositories for html2text

Users who are interested in html2text often compare it to the libraries listed below.

aaronsw / html2text
Convert HTML to Markdown-formatted text.
☆2,817 · Feb 27, 2024 · Updated 2 years ago
michaelhelmick / lassie
Web Content Retrieval for Humans™
☆629 · Jul 30, 2022 · Updated 3 years ago
buriy / python-readability
fast python port of arc90's readability tool, updated to match latest readability.js!
☆2,894 · Jan 26, 2026 · Updated 5 months ago
matthewwithanm / python-markdownify
Convert HTML to Markdown
☆2,219 · Jun 30, 2026 · Updated 2 weeks ago
deanmalmgren / textract
extract text from any document. no muss. no fuss.
☆4,663 · Jul 11, 2026 · Updated last week
weblyzard / inscriptis
A python based HTML to text conversion library, command line client and Web service.
☆345 · Jun 22, 2026 · Updated 3 weeks ago
adbar / trafilatura
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XM…
☆6,295 · Jul 9, 2026 · Updated last week
coleifer / micawber
a small library for extracting rich content from urls
☆681 · Jul 5, 2026 · Updated last week
codelucas / newspaper
newspaper3k is a news, full-text, and article metadata extraction in Python 3. Advanced docs:
☆15,100 · Jul 8, 2026 · Updated last week
miso-belica / jusText
Heuristic based boilerplate removal tool
☆818 · Feb 25, 2025 · Updated last year
kurtmckee / feedparser
Parse feeds in Python
☆2,401 · Jul 6, 2026 · Updated last week
mozilla / bleach
Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes
☆2,772 · Jun 5, 2026 · Updated last month
miso-belica / sumy
Module for automatic summarization of text documents and HTML pages.
☆3,694 · Jun 23, 2026 · Updated 3 weeks ago
psf / requests-html
Pythonic HTML Parsing for Humans™
☆13,827 · Apr 16, 2024 · Updated 2 years ago
lepture / mistune
A fast yet powerful Python Markdown parser with renderers and plugins.
☆3,056 · Jul 9, 2026 · Updated last week
MechanicalSoup / MechanicalSoup
A Python library for automating interaction with websites.
☆4,876 · Jun 26, 2026 · Updated 3 weeks ago
Python-Markdown / markdown
A Python implementation of John Gruber’s Markdown with Extension support.
☆4,225 · Jul 8, 2026 · Updated last week
ahupp / python-magic
A python wrapper for libmagic
☆2,914 · Jun 22, 2026 · Updated 3 weeks ago
trentm / python-markdown2
markdown2: A fast and complete implementation of Markdown in Python
☆2,819 · Updated this week
py-pdf / pypdf
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
☆10,117 · Jun 30, 2026 · Updated 2 weeks ago
encode / httpx
A next generation HTTP client for Python. 🦋
☆15,355 · Mar 29, 2026 · Updated 3 months ago
seatgeek / fuzzywuzzy
Fuzzy String Matching in Python
☆9,260 · Feb 24, 2023 · Updated 3 years ago
html5lib / html5lib-python
Standards-compliant library for parsing and serializing HTML documents and fragments in Python
☆1,223 · Apr 21, 2026 · Updated 2 months ago
dateutil / dateutil
Useful extensions to the standard Python datetime features
☆2,632 · May 19, 2026 · Updated last month
jd / tenacity
Retrying library for Python
☆8,715 · Updated this week
jazzband / pip-tools
A set of tools to keep your pinned Python dependencies fresh.
☆8,009 · Updated this week
scrapy / parsel
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
☆1,341 · Jul 6, 2026 · Updated last week
pdfminer / pdfminer.six
Community maintained fork of pdfminer - we fathom PDF
☆7,002 · Mar 13, 2026 · Updated 4 months ago
gruns / furl
🌐 The easiest way to parse and modify URLs in Python.
☆2,808 · Feb 22, 2026 · Updated 4 months ago
jazzband / tablib
Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
☆4,757 · Updated this week
pallets / click
Python composable command line interface toolkit
☆17,583 · Jul 10, 2026 · Updated last week
Bogdanp / dramatiq
A fast and reliable background task processing library for Python 3.
☆5,287 · Jul 6, 2026 · Updated last week
grantjenks / python-diskcache
Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python.
☆2,896 · Aug 10, 2024 · Updated last year
python-poetry / poetry
Python packaging and dependency management made easy
☆34,296 · Updated this week
scrapinghub / dateparser
python parser for human readable dates
☆2,844 · Updated this week
python-pendulum / pendulum
Python datetimes made easy
☆6,671 · Jul 6, 2026 · Updated last week
rspeer / python-ftfy
Fixes mojibake and other glitches in Unicode text, after the fact.
☆4,052 · Oct 30, 2024 · Updated last year
martinblech / xmltodict
Python module that makes working with XML feel like you are working with JSON
☆5,746 · Jun 15, 2026 · Updated last month
coleifer / huey
a little task queue for python
☆5,985 · Jul 12, 2026 · Updated last week