masukomi / arc90-readability
A copy of the original Arc90 repo with links to many of the current ports.
☆224 · Updated 7 months ago
Alternatives and similar repositories for arc90-readability:
Users who are interested in arc90-readability are comparing it to the libraries listed below.
- Reworked https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative) ☆204 · Updated 9 months ago
- Turn any web page into a clean view ☆2,504 · Updated 3 years ago
- A drop-in replacement for the Postlight Parser API. ☆283 · Updated 2 years ago
- Site-specific article extraction rules to aid content extractors, feed readers, and 'read later' applications. ☆381 · Updated last week
- Node proxy server attempting to fetch readable contents from any provided URL. ☆102 · Updated 7 years ago
- A fork of the Arc90 Labs Readability bookmarklet ☆79 · Updated 6 years ago
- Distills the DOM ☆655 · Updated 3 years ago
- An exercise in unsupervised machine learning: extract an article's text from HTML documents. ☆433 · Updated 11 months ago
- A bundle of html content extraction algorithms ☆121 · Updated 9 years ago
- Html Content / Article Extractor in Scala - open sourced from Gravity Labs - http://gravity.com ☆343 · Updated 5 years ago
- Work in progress transmit from Google Code ☆1,114 · Updated 7 years ago
- Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages ☆543 · Updated 3 years ago
- A port of the arclabs 'readability' package to Java ☆72 · Updated 12 years ago
- A Python library to detect and extract listing data from HTML pages. ☆109 · Updated 7 years ago
- Manually compare various readable web extractor libraries against different websites ☆21 · Updated 2 years ago
- FeedHQ is a web-based feed reader ☆576 · Updated 2 years ago
- C library for handling Kindle (MOBI) formats of ebook documents ☆432 · Updated 3 months ago
- A collection of tools to help with the Google Reader shutdown. ☆467 · Updated 6 years ago
- Extract data from websites using basic statistical magic ☆506 · Updated 4 years ago
- Web Content Extraction Through Machine Learning ☆185 · Updated 10 years ago
- Utilities for extracting notes from Notes.app. This repository is lightly maintained and mainly exists to serve as documentation and star… ☆231 · Updated 2 years ago
- Repository for Pipes ☆269 · Updated 6 months ago
- Full-Text RSS can transform partial feeds to deliver the full content stripped of clutter and ads ☆166 · Updated 8 years ago
- Diff, Match and Patch Library (original at http://google.com/p/google-diff-match-patch) ☆204 · Updated 9 years ago
- Rewrite of fantastic Soulver application ☆139 · Updated 12 years ago
- A language detection Web Service ☆53 · Updated 7 years ago
- The Hypothesis browser extensions. ☆497 · Updated this week
- Example repository for creating your own RSS feeds using Feed me up, Scotty! ☆140 · Updated this week
- Snapshots a web page to get it as a static, self-contained HTML document. ☆292 · Updated 2 years ago
- Chrome extension to "Create WARC files from any webpage" ☆217 · Updated last year