masukomi / arc90-readability
A copy of the original Arc90 repo with links to many of the current ports.
☆214Updated 2 months ago
Related projects:
- Reworked https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now a living alternative)☆204Updated 4 months ago
- 📚 Turn any web page into a clean view☆2,477Updated 3 years ago
- Since the original was abandoned to start a web service, I'm now going to attempt to maintain the JS+CSS portion☆167Updated 6 years ago
- Site-specific article extraction rules to aid content extractors, feed readers, and 'read later' applications.☆360Updated this week
- 🚀 A drop-in replacement for the Postlight Parser API.☆282Updated last year
- A fork of the Arc90 Labs Readability bookmarklet☆77Updated 5 years ago
- Let's bring Readability to Chrome!☆210Updated 7 years ago
- An exercise in unsupervised machine learning: extract an article's text from HTML documents.☆435Updated 6 months ago
- A fast and platform-independent readability port (JS)☆245Updated 10 months ago
- Distills the DOM☆645Updated 2 years ago
- Work in progress transmit from Google Code☆1,107Updated 6 years ago
- [abandoned] python port of arc90's readability bookmarklet☆537Updated 13 years ago
- Node proxy server attempting to fetch readable contents from any provided URL.☆104Updated 7 years ago
- Server side readability with node.js☆397Updated 13 years ago
- Readability clone in Java☆461Updated 3 years ago
- Automatically extract body content (and other cool stuff) from an html document☆2,149Updated last year
- HTML Content / Article Extractor in Scala - open sourced from Gravity Labs - http://gravity.com☆343Updated 5 years ago
- Snapshots a web page to get it as a static, self-contained HTML document.☆269Updated 2 years ago
- FeedHQ is a web-based feed reader☆570Updated 2 years ago
- Article extraction benchmark: dataset and evaluation scripts☆274Updated 4 months ago
- fast python port of arc90's readability tool, updated to match latest readability.js!☆2,645Updated last month
- create a periodical .mobi, with kindlegen☆41Updated 3 years ago
- Scrape/crawl articles from any site automatically. Make any web page readable, whether Chinese or English.☆343Updated 6 years ago
- A bundle of html content extraction algorithms☆122Updated 9 years ago
- ☆72Updated last year
- Generate EPUB books from HTML with simple API in Node.js.☆426Updated last year
- Webrecorder Desktop App!☆202Updated 3 years ago
- Just the facts -- web page content extraction☆1,244Updated 2 months ago
- WarcDB: Web crawl data as SQLite databases.☆390Updated 2 months ago
- a python readability☆276Updated 7 years ago