Reworked https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative)
☆205May 9, 2024Updated last year
Alternatives and similar repositories for breadability
Users that are interested in breadability are comparing it to the libraries listed below
- fast python port of arc90's readability tool, updated to match latest readability.js!☆2,894Jan 26, 2026Updated last month
- Python wrapper for the Readability API.☆134Sep 8, 2021Updated 4 years ago
- a python readability☆277Jun 22, 2017Updated 8 years ago
- python-readability, but faster (mirror-ish)☆82Jan 24, 2012Updated 14 years ago
- [unmaintained] Python version of arc90's *older* readability.js☆47Oct 30, 2011Updated 14 years ago
- [abandoned] python port of arc90's readability bookmarklet☆543Jun 16, 2011Updated 14 years ago
- An exercise in unsupervised machine learning: extract an article's text from HTML documents.☆431Jan 16, 2026Updated 2 months ago
- HTML Content / Article Extractor, web scraping lib in Python☆4,070Mar 10, 2026Updated last week
- The more often you click a word in the headlines, the more interesting are your news.☆13Mar 27, 2017Updated 8 years ago
- Framework for evaluating text extraction algorithms implemented as web services☆42Jun 30, 2012Updated 13 years ago
- Readability/Boilerpipe extraction in Python☆55May 6, 2016Updated 9 years ago
- Just the facts -- web page content extraction☆1,279Jul 8, 2025Updated 8 months ago
- C library for efficient string matching with Aho-Corasick☆21Jan 20, 2012Updated 14 years ago
- newspaper3k is a news, full-text, and article metadata extraction library in Python 3. Advanced docs:☆15,010Dec 6, 2025Updated 3 months ago
- Heuristic based boilerplate removal tool☆814Feb 25, 2025Updated last year
- Demonstration of gpt-2 model with flask+uwsgi+nginx in web environment containerized in docker for quick deployment.☆13Mar 24, 2023Updated 2 years ago
- Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages☆32Sep 2, 2016Updated 9 years ago
- A web scraper in Python using Django and Celery☆16May 12, 2013Updated 12 years ago
- EPWING dictionary viewer☆11Nov 13, 2018Updated 7 years ago
- 📚 Turn any web page into a clean view☆2,521Apr 3, 2021Updated 4 years ago
- ☆13Dec 4, 2019Updated 6 years ago
- A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html☆904Feb 6, 2026Updated last month
- A port of the arclabs 'readability' package to Java☆72Sep 10, 2012Updated 13 years ago
- Article extraction benchmark: dataset and evaluation scripts☆356Mar 1, 2026Updated 3 weeks ago
- Module for automatic summarization of text documents and HTML pages.☆3,664Feb 14, 2026Updated last month
- A Python application to download Google Books and convert them in PDF in a given folder.☆11Oct 21, 2013Updated 12 years ago
- Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages☆541Jul 17, 2021Updated 4 years ago
- A fast JSON to Markup Engine☆60May 6, 2011Updated 14 years ago
- The New Yet another Readline for Go☆34Feb 21, 2026Updated last month
- Quill OS root filesystem☆21Jan 4, 2026Updated 2 months ago
- A collection of Dashboard modules for Django Admin Tools, includes dashboards for Memcache statistics, Varnish statistics, and RSS dashboa…☆29Jan 11, 2012Updated 14 years ago
- ☆118Feb 4, 2026Updated last month
- A really simple wiki engine with bottlepy☆13Mar 10, 2012Updated 14 years ago
- Styles for TaskPaper 3☆11Jan 27, 2018Updated 8 years ago
- Fast text chunking algorithms for Python☆12Oct 7, 2020Updated 5 years ago
- Scripts for Glyphs.app☆12Jan 20, 2026Updated 2 months ago
- Backup small important files to paper☆33Mar 30, 2023Updated 2 years ago
- GitHub Action that builds OCP models (CadQuery/Build123d/...), renders them and sets up a model viewer on GitHub Pages.☆11Mar 30, 2024Updated last year
- An Atom package for creating a zettelkasten style wiki. Should be used with my Academic-Markdown syntax file☆12Jun 3, 2021Updated 4 years ago