Alir3z4 / html2text
Convert HTML to Markdown-formatted text.
☆2,013 Updated 2 months ago
Alternatives and similar repositories for html2text
Users who are interested in html2text are comparing it to the libraries listed below.
- fast python port of arc90's readability tool, updated to match latest readability.js! ☆2,810 Updated last month
- Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors ☆1,235 Updated last month
- Parse feeds in Python ☆2,142 Updated last week
- Convert HTML to Markdown-formatted text. ☆2,724 Updated last year
- 🌐 URL parsing and manipulation made easy. ☆2,690 Updated 3 months ago
- extract text from any document. no muss. no fuss. ☆4,174 Updated 6 months ago
- Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. ☆1,594 Updated 2 months ago
- Convert HTML to Markdown ☆1,685 Updated 2 weeks ago
- A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html ☆873 Updated 6 months ago
- Thin wrapper for "pandoc" (MIT) ☆1,001 Updated 2 weeks ago
- Requests + Gevent = <3 ☆4,561 Updated 10 months ago
- A service daemon to run Scrapy spiders ☆3,040 Updated 2 months ago
- Heuristic based boilerplate removal tool ☆785 Updated 4 months ago
- Web Content Retrieval for Humans™ ☆621 Updated 2 years ago
- Safely pass trusted data to untrusted environments and back. ☆3,022 Updated 2 weeks ago
- Scrapy+Splash for JavaScript integration ☆3,213 Updated 4 months ago
- Useful extensions to the standard Python datetime features ☆2,476 Updated 2 months ago
- The easy way to send notifications ☆2,706 Updated last week
- Python character encoding detector ☆2,267 Updated 5 months ago
- Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XM… ☆4,419 Updated last month
- Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors). ☆1,312 Updated last week
- Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python. ☆2,558 Updated 10 months ago
- Python module for cross-platform clipboard functions. ☆1,766 Updated last year
- A fast, extensible and spec-compliant Markdown parser in pure Python. ☆932 Updated 2 months ago
- Headless chrome/chromium automation library (unofficial port of puppeteer) ☆3,863 Updated last year
- Wkhtmltopdf python wrapper to convert html to pdf ☆2,023 Updated last year
- A fast yet powerful Python Markdown parser with renderers and plugins. ☆2,815 Updated last month
- A jquery-like library for python ☆2,357 Updated 10 months ago
- Fixes mojibake and other glitches in Unicode text, after the fact. ☆3,922 Updated 8 months ago
- JMESPath is a query language for JSON. ☆2,323 Updated last year