ArchiveBox / readability-extractor
JavaScript/Node wrapper around Mozilla's Readability library, so that ArchiveBox can call it as a one-shot CLI command to extract each page's article text.
☆38 Updated 4 months ago
Alternatives and similar repositories for readability-extractor:
Users who are interested in readability-extractor are comparing it to the libraries listed below:
- Clean a series of links, resolving redirects and finding Wayback results if page is gone. Originally written to aid with importing from A… ☆16 Updated 3 months ago
- DigestBox takes any webpage URL (news article, video link, comment thread, etc.) and gives you just the raw content. It's powered by Arch… ☆18 Updated 11 months ago
- Something like a public wiki: a place to store notes, ideas, blog posts, photography, or writing ☆18 Updated last week
- Export your GitHub activity: events, repositories, stars, etc. ☆48 Updated last year
- [Moved to https://github.com/standardnotes/app] A code editor for Standard Notes with syntax highlighting support for over 120 programmin… ☆13 Updated 2 years ago
- Personal news feed: search for results on Reddit/Pinboard/Twitter/Hackernews and read as RSS ☆30 Updated 4 months ago
- Extend Firefox's history capabilities with browsing stats, improved searching and additional features ☆24 Updated last year
- Proxies third-party PDF files and HTML pages with the Hypothesis client embedded, so you can annotate them ☆21 Updated this week
- Encapsulate dom-anchor-text-quote and dom-anchor-text-position for use in browser scripts ☆10 Updated 3 years ago
- Collaborative cheatsheets for console commands (tldr project) now in your Browser! ☆14 Updated 3 years ago
- Import data from Google Takeout to search and analyze ☆16 Updated 2 years ago
- linkbak is a web page archiver: it reads a list of links and dumps the corresponding pages in HTML and PDF. ☆14 Updated 2 years ago
- Homebrew formula for the ArchiveBox self-hosted internet archiving solution. ☆27 Updated 3 months ago
- Official Python package for ArchiveBox, the self-hosted internet archiving solution. ☆13 Updated 3 months ago
- A modified version of searx (the privacy-respecting metasearch engine) to only search an allowlist of sites, to build functionality simil… ☆19 Updated 3 years ago
- Awesome links related to RSS, ATOM, and Syndication formats. ☆50 Updated 5 months ago
- Where knowledge grows. ☆13 Updated 2 months ago
- rsstodolist Firefox and Chrome addon (using Web Extension API) ☆13 Updated last year
- Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing… ☆55 Updated this week
- An automatic JSON API for HPI ☆15 Updated 2 months ago
- Derived from https://github.com/telerik/kendo-ui-core ☆12 Updated last year
- Uroute: Route URLs to configured browsers ☆31 Updated 2 years ago
- Daemon which connects to active mpv instances, saving a history of what I watch/listen to ☆13 Updated 2 months ago
- Encode/decode binary data over a live streaming video in real time. ☆13 Updated last year
- Source for the GitHub Wiki / ReadTheDocs documentation for ArchiveBox, the self-hosted internet archiving solution. ☆14 Updated last month
- Exports all accessible reddit comments for an account using pushshift ☆11 Updated 2 months ago
- TagSpaces Web Clipper for Chrome and Firefox ☆40 Updated 2 weeks ago
- The ArchiveWeb.page Site ☆27 Updated last month
- Search, download, convert and send files directly to your kindle from Libgen in one place. ☆23 Updated 2 years ago
- Centralize, view, edit, label and organize collections of your favorite URLs 🔗 📙 ☆38 Updated 2 years ago