Convert HTML to Markdown-formatted text.
☆2,130 · Oct 28, 2025 · Updated 4 months ago
Alternatives and similar repositories for html2text
Users who are interested in html2text are comparing it to the libraries listed below.
- Convert HTML to Markdown-formatted text. ☆2,883 · Feb 27, 2024 · Updated 2 years ago
- Web Content Retrieval for Humans™ ☆630 · Jul 30, 2022 · Updated 3 years ago
- Convert HTML to Markdown ☆2,076 · Nov 16, 2025 · Updated 3 months ago
- fast python port of arc90's readability tool, updated to match latest readability.js! ☆2,889 · Jan 26, 2026 · Updated last month
- A python based HTML to text conversion library, command line client and Web service. ☆337 · Nov 18, 2025 · Updated 3 months ago
- extract text from any document. no muss. no fuss. ☆4,458 · Feb 4, 2026 · Updated 3 weeks ago
- Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XM… ☆5,337 · Sep 12, 2025 · Updated 5 months ago
- newspaper3k is a news, full-text, and article metadata extraction in Python 3. Advanced docs: ☆14,996 · Dec 6, 2025 · Updated 2 months ago
- a small library for extracting rich content from urls ☆673 · Jan 7, 2026 · Updated last month
- Parse feeds in Python ☆2,308 · Feb 2, 2026 · Updated last month
- Heuristic based boilerplate removal tool ☆811 · Feb 25, 2025 · Updated last year
- Module for automatic summarization of text documents and HTML pages. ☆3,661 · Feb 14, 2026 · Updated 2 weeks ago
- Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes ☆2,757 · Feb 2, 2026 · Updated last month
- A fast yet powerful Python Markdown parser with renderers and plugins. ☆2,984 · Feb 22, 2026 · Updated last week
- Pythonic HTML Parsing for Humans™ ☆13,874 · Apr 16, 2024 · Updated last year
- A Python implementation of John Gruber’s Markdown with Extension support. ☆4,173 · Feb 9, 2026 · Updated 3 weeks ago
- A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files ☆9,839 · Updated this week
- Fuzzy String Matching in Python ☆9,270 · Feb 24, 2023 · Updated 3 years ago
- A python wrapper for libmagic