INA-DLWeb / LiveArchivingProxy
An HTTP Proxy that archives all intercepted traffic.
☆21Updated 10 years ago
Alternatives and similar repositories for LiveArchivingProxy:
Users who are interested in LiveArchivingProxy are comparing it to the repositories listed below.
- Saves proxied HTTP traffic to a WARC file.☆27Updated 11 years ago
- A dockerized, queued high fidelity web archiver based on Squidwarc☆58Updated 8 months ago
- Trough: Big data, small databases.☆40Updated 8 months ago
- Serving content from a WARC☆61Updated 12 years ago
- wabac.js - Web Archive Browsing Augmentation Client☆106Updated this week
- An experiment in creating a dump of your personal browser history for analysis☆33Updated 6 years ago
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages.☆29Updated 2 weeks ago
- Convert Directories, Files and ZIP Files to Web Archives (WARC)☆85Updated 2 weeks ago
- Utilities for interpreting microformats2 data☆17Updated 2 years ago
- Downloads and imports Wikipedia page histories to a git repository☆34Updated 3 months ago
- A Memento Aggregator CLI and Server in Go☆62Updated 3 weeks ago
- Browsertrix: Containerized High-Fidelity Browser-Based Automated Crawling + Behavior System☆87Updated 4 years ago
- Datasette plugin providing an automatic GraphQL API for your SQLite databases☆102Updated 11 months ago
- Convert HTTP Archive (HAR) -> Web Archive (WARC) format☆51Updated 6 years ago
- Sort-friendly URI Reordering Transform (SURT) python module☆41Updated 8 months ago
- Web archiving using Google Chrome☆44Updated 5 years ago
- ☆27Updated 2 years ago
- Check out https://github.com/webrecorder/webrecorder for newer version matching https://webrecorder.io☆38Updated 9 years ago
- fork-to-sign contributor licensing agreement for all pull-request-driven projects☆35Updated 2 years ago
- ☆24Updated 9 years ago
- A tool for collecting archival slivers of the web and web archives☆13Updated last month
- Bundle external assets in a HTML file to distribute a stand-alone HTML document.☆30Updated 3 years ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.☆44Updated 7 years ago
- Test cases for validating BagIt implementations☆11Updated 2 years ago
- Webrecorder Automated In-Page Behavior Framework☆13Updated 3 years ago
- Scrapers for disaster data - writes to https://github.com/simonw/disaster-data☆49Updated last year
- Push notification adapter for feeds☆31Updated 2 weeks ago
- Add editing UI and other power-user features to Datasette.☆12Updated 2 years ago
- Allow URLs to point to any text piece in a document☆16Updated 7 years ago
- DocumentCloud's back end source code - Please report bugs, issues and feature requests to info@documentcloud.org☆37Updated last week