Alir3z4 / html2text
Convert HTML to Markdown-formatted text.
☆2,070 Updated 6 months ago
Alternatives and similar repositories for html2text
Users that are interested in html2text are comparing it to the libraries listed below
- fast python port of arc90's readability tool, updated to match latest readability.js! ☆2,849 Updated 5 months ago
- Convert HTML to Markdown-formatted text. ☆2,761 Updated last year
- Convert HTML to Markdown ☆1,823 Updated 2 months ago
- Parse feeds in Python ☆2,218 Updated 2 weeks ago
- A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html ☆890 Updated last month
- extract text from any document. no muss. no fuss. ☆4,338 Updated 10 months ago
- python parser for human readable dates ☆2,730 Updated 2 months ago
- A fast yet powerful Python Markdown parser with renderers and plugins. ☆2,881 Updated last month
- Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors ☆1,277 Updated last month
- Convert Word documents (.docx files) to HTML ☆1,010 Updated last month
- Thin wrapper for "pandoc" (MIT) ☆1,054 Updated 2 weeks ago
- Returns unicode slugs ☆1,554 Updated 3 weeks ago
- A python wrapper for libmagic ☆2,826 Updated last week
- A python based HTML to text conversion library, command line client and Web service. ☆322 Updated 2 months ago
- Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. ☆1,627 Updated 6 months ago
- A jquery-like library for python ☆2,361 Updated last year
- Python character encoding detector ☆2,289 Updated last week
- Heuristic based boilerplate removal tool ☆798 Updated 7 months ago
- Port of Google's language-detection library to Python. ☆1,851 Updated 7 months ago
- Simple PDF text extraction ☆954 Updated 8 months ago
- emoji terminal output for Python ☆2,004 Updated last month
- Persistent HTTP cache for python requests ☆1,457 Updated last week
- Useful extensions to the standard Python datetime features ☆2,548 Updated last month
- A Python implementation of John Gruber’s Markdown with Extension support. ☆4,101 Updated 3 weeks ago
- Just the facts -- web page content extraction ☆1,273 Updated 3 months ago
- Accurately separates a URL’s subdomain, domain, and public suffix, using the Public Suffix List (PSL). ☆1,935 Updated this week
- markdown2: A fast and complete implementation of Markdown in Python ☆2,796 Updated last week
- Python binding to Modest and Lexbor engines. Fast HTML5 parser with CSS selectors for Python. ☆1,447 Updated last week
- The official source code for the python-mechanize project ☆754 Updated 4 months ago
- Web Content Retrieval for Humans™ ☆623 Updated 3 years ago