webrecorder / browsertrix
Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!
★380 · Updated this week
Alternatives and similar repositories for browsertrix
Users who are interested in browsertrix are comparing it to the repositories listed below.
- Run a high-fidelity browser-based web archiving crawler in a single Docker container ★948 · Updated last week
- High-fidelity, browser-based, single-page web archiving library and CLI for witnessing the web. ★185 · Updated 4 months ago
- Specifications developed and maintained by the Webrecorder community. ★138 · Updated 3 months ago
- wabac.js - Web Archive Browsing Augmentation Client ★119 · Updated last month
- Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox. ★405 · Updated 8 months ago
- Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing… ★106 · Updated 2 months ago
- Serverless replay of web archives directly in the browser ★895 · Updated last month
- Converts WARC files to static HTML ★49 · Updated 3 months ago
- Core Python Web Archiving Toolkit for replay and recording of web archives ★1,600 · Updated last month
- ★55 · Updated last year
- Web Archiving Integration Layer: One-Click User Instigated Preservation ★385 · Updated 10 months ago
- Indelible links ★492 · Updated 3 weeks ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ★55 · Updated last month
- Command line tool for digging into WARC files ★50 · Updated last week
- A compilation of research relevant to Data Together's efforts tackling the general problem of data resilience & interactivity ★97 · Updated 7 years ago
- A list of things related to software, literature, and other content for Memento ★104 · Updated last week
- A Memento Aggregator CLI and Server in Go ★73 · Updated 10 months ago
- A search interface and wayback machine for the UKWA Solr based warc-indexer framework. ★134 · Updated this week
- A tool for detecting viruses and NSFW material in WARC files ★17 · Updated last month
- Web archive index server based on RocksDB ★37 · Updated 2 weeks ago
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages. ★39 · Updated last month
- Please note that the warc-indexer tool & code is now supported by NetArchiveSuite. The 'warc-indexer' directory and code that exists in t… ★132 · Updated last month
- Chrome extension to "Create WARC files from any webpage" ★227 · Updated last month
- Centralised repository for WARC usage specifications. ★120 · Updated 3 months ago
- (Experimental) High-fidelity capture of Twitter threads as sealed PDFs. ★55 · Updated 2 years ago
- Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder) ★445 · Updated 5 years ago
- Command line tool to convert a file in the WARC format to a file in the ZIM format ★75 · Updated last month
- brozzler - distributed browser-based web crawler ★768 · Updated this week
- Convert Directories, Files and ZIP Files to Web Archives (WARC) ★91 · Updated 8 months ago
- Command line tools and libraries for handling and manipulating WARC files (and HTTP contents) ★168 · Updated 4 months ago