harvard-lil / scoop
🍨 High-fidelity, browser-based, single-page web archiving library and CLI for witnessing the web.
☆177 · Updated last month
Alternatives and similar repositories for scoop
Users who are interested in scoop are comparing it to the repositories listed below.
- (Experimental) High-fidelity capture of Twitter threads as sealed PDFs. ☆55 · Updated last year
- Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more … ☆342 · Updated last week
- wabac.js - Web Archive Browsing Augmentation Client ☆114 · Updated last week
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages. ☆38 · Updated 5 months ago
- Specifications developed and maintained by the Webrecorder community. ☆136 · Updated this week
- Static Site Generator for Viewing Web Archives (in WACZ) format ☆28 · Updated 2 years ago
- Converts WARC files to static HTML ☆49 · Updated last month
- 🧩 Proposal to allow user scripts like "expand comments", "hide popups", "fill out this form", etc. to be reusable across pure browser en… ☆18 · Updated 3 months ago
- Create and edit WARC and WACZ files ☆17 · Updated 10 months ago
- A list of things related to software, literature, and other content for Memento ☆100 · Updated last year
- A tool for collecting archival slivers of the web and web archives ☆16 · Updated 8 months ago
- A simple Python wrapper and command-line interface for archive.org's "Save Page Now" capturing service ☆185 · Updated last year
- Convert Directories, Files and ZIP Files to Web Archives (WARC) ☆89 · Updated 6 months ago
- A search interface and wayback machine for the UKWA Solr based warc-indexer framework. ☆131 · Updated 2 weeks ago
- A Memento Aggregator CLI and Server in Go ☆68 · Updated 7 months ago
- Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing… ☆96 · Updated last week
- Web archive index server based on RocksDB ☆35 · Updated last month
- Please note that the warc-indexer tool & code is now supported by NetArchiveSuite. The 'warc-indexer' directory and code that exists in t… ☆129 · Updated 2 months ago
- A compilation of research relevant to Data Together's efforts tackling the general problem of data resilience & interactivity ☆97 · Updated 7 years ago
- A tool for detecting viruses and NSFW material in WARC files ☆17 · Updated last year
- Command line tool for digging into WARC files ☆46 · Updated 3 weeks ago
- Chrome extension to "Create WARC files from any webpage" ☆223 · Updated last year
- JavaScript module and CLI tool for working with web archive data using the WACZ format specification. ☆16 · Updated 7 months ago
- ☆53 · Updated last year
- A command line utility for listing and searching snapshots in web archives ☆17 · Updated last year
- searchmysite.net is an open source search engine and search as a service ☆134 · Updated 2 weeks ago
- Chrome extension that uses Memento to indicate that a page a user is viewing on the live web has an archived copy and to give the user ac… ☆55 · Updated last month
- Tool to index and serve HTML files. Powered by Datasette. ☆107 · Updated 3 years ago
- Centralised repository for WARC usage specifications. ☆117 · Updated last week
- Find possible host names in a source text ☆53 · Updated 2 years ago