masukomi / arc90-readability
A copy of the original Arc90 repo with links to many of the current ports.
★228, updated 11 months ago
Alternatives and similar repositories for arc90-readability
Users who are interested in arc90-readability are comparing it to the libraries listed below.
- Reworked https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative) (★204, updated last year)
- Turn any web page into a clean view (★2,515, updated 4 years ago)
- Site-specific article extraction rules to aid content extractors, feed readers, and 'read later' applications. (★400, updated this week)
- A drop-in replacement for the Postlight Parser API. (★283, updated 2 years ago)
- (★80, updated 5 months ago)
- Utilities for extracting notes from Notes.app. This repository is lightly maintained and mainly exists to serve as documentation and star… (★237, updated 2 years ago)
- A collection of tools to help with the Google Reader shutdown. (★468, updated 6 years ago)
- Chrome extension to "Create WARC files from any webpage" (★220, updated last year)
- Manually compare various readable web extractor libraries against different websites (★21, updated 2 years ago)
- Repository for Pipes (★272, updated 10 months ago)
- Automatically extract body content (and other cool stuff) from an html document (★2,157, updated 2 years ago)
- Scrape/crawl articles from any site automatically. Make any web page readable, whether Chinese or English. (★344, updated 6 years ago)
- EPUB processing engine written in Javascript (★384, updated 3 years ago)
- English lemmatizer (★67, updated 2 years ago)
- create a periodical .mobi, with kindlegen (★43, updated 3 years ago)
- Google Chrome Extension API for Safari (★107, updated last year)
- Heuristic based boilerplate removal tool (★780, updated 3 months ago)
- Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head (★170, updated 5 years ago)
- Offline-first web browser (★89, updated 6 years ago)
- jq in the browser with emscripten. (★337, updated 2 months ago)
- Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages (★542, updated 3 years ago)
- A CLI for Mozilla Readability. Get clean, uncluttered, ready-to-read HTML from any webpage! (★51, updated 2 years ago)
- Combines epub-cache with epub2html in order to make Epub content readable via a browser, without special plugins, extensions or hacks. (★30, updated 11 years ago)
- using XPDF, pdftojson extracts text from PDF files as JSON, including word bounding boxes. (★145, updated last year)
- Cocoa editor for creating commit messages (★185, updated 2 months ago)
- Firefox Reader View as a command line tool (★883, updated 3 weeks ago)
- Indelible links (★466, updated last week)
- Install and debug iPhone apps from the command line, without using Xcode (★80, updated 3 years ago)
- Simple JSON based geolocation API, powered by Google App Engine. (★106, updated 12 years ago)
- Export Apple Notes to SQLite (★207, updated last year)