peterk/warcworker

peterk / warcworker

A dockerized, queued high fidelity web archiver based on Squidwarc

☆62

Alternatives and similar repositories for warcworker

Users that are interested in warcworker are comparing it to the libraries listed below.

N0taN3rd / Squidwarc
Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head
☆178 · May 19, 2020 · Updated 6 years ago
webrecorder / cdxj-indexer
CDXJ Indexing of WARC/ARCs
☆35 · May 11, 2026 · Updated 2 months ago
helgeho / Web2Warc
An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX)
☆26 · Oct 9, 2017 · Updated 8 years ago
PromyLOPh / crocoite
Web archiving using Google Chrome
☆45 · Dec 30, 2019 · Updated 6 years ago
ukwa / ukwa-pywb
☆11 · Nov 21, 2025 · Updated 8 months ago
unt-libraries / py-wasapi-client
A client for the Archive-It And Webrecorder WASAPI Data Transfer API
☆16 · Oct 18, 2019 · Updated 6 years ago
eugeneware / warc
Parse WARC (Web Archive Files) as a node.js stream
☆23 · Oct 20, 2014 · Updated 11 years ago
peterk / munin-indexer
A social media open post web archiving tool
☆26 · Feb 4, 2026 · Updated 5 months ago
harvard-lil / js-wacz
JavaScript module and CLI tool for working with web archive data using the WACZ format specification.
☆17 · Mar 11, 2025 · Updated last year
oduwsdl / Reconstructive
A ServiceWorker for client-side reconstruction of composite mementos
☆15 · Mar 6, 2025 · Updated last year
webis-de / wasp
☆28 · Jun 30, 2026 · Updated 3 weeks ago
ukwa / webarchive-discovery
Please note that the warc-indexer tool & code is now supported by NetArchiveSuite. The 'warc-indexer' directory and code that exists in t…
☆133 · Nov 21, 2025 · Updated 8 months ago
anjackson / sliver
A tool for collection archival slivers of the web and web archives
☆19 · Jun 1, 2026 · Updated last month
webrecorder / webrecorder-player
Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder)
☆445 · Sep 17, 2020 · Updated 5 years ago
NCSU-Libraries / ocracoke
Rails application supporting the creation of OCR and the IIIF Content Search API
☆33 · Dec 14, 2022 · Updated 3 years ago
harvard-lil / waczerciser
Create and edit WARC and WACZ files
☆29 · Dec 6, 2024 · Updated last year
machawk1 / wail
Web Archiving Integration Layer: One-Click User Instigated Preservation
☆398 · Jun 19, 2026 · Updated last month
harvard-lil / thread-keeper
(Experimental) High-fidelity capture of Twitter threads as sealed PDFs.
☆55 · Dec 4, 2023 · Updated 2 years ago
web-archive-group / heritrix-walkthrough
☆10 · Jun 10, 2016 · Updated 10 years ago
N0taN3rd / simplechrome
Webrecorders DevTools Protocol Automation Library
☆18 · Oct 18, 2022 · Updated 3 years ago
NationalLibraryOfNorway / warchaeology
Command line tool for digging into WARC files
☆50 · Updated this week
webrecorder / browsertrix-old
Browsertrix: Containerized High-Fidelity Browser-Based Automated Crawling + Behavior System
☆87 · Feb 16, 2021 · Updated 5 years ago
bsdphk / AardWARC
Museum-quality bit-archive storage management
☆11 · Mar 25, 2026 · Updated 3 months ago
vphill / web-archiving-course
Web Archiving Course
☆23 · Mar 4, 2024 · Updated 2 years ago
DocNow / waybackprov
utility to fetch provenance information from Internet Archive's Wayback Machine
☆15 · Feb 5, 2026 · Updated 5 months ago
webrecorder / warcit
Convert Directories, Files and ZIP Files to Web Archives (WARC)
☆99 · Apr 22, 2025 · Updated last year
archivesunleashed / aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
☆158 · Dec 5, 2025 · Updated 7 months ago
asmecher / hypothesis
A Hypothes.is integration plugin for OJS
☆12 · Mar 17, 2025 · Updated last year
Historypin / community-cloud-storage
decentralized storage layer for community archives
☆15 · Dec 18, 2025 · Updated 7 months ago
nlnwa / gowarcserver
☆17 · Mar 31, 2025 · Updated last year
kegashe / obsidian-css-snippets
Obsidian CSS snippets to tweak UI and harmonize various plugins.
☆11 · Jul 18, 2024 · Updated 2 years ago
glenrobson / iiif_stuff
IIIF Examples and useful code
☆20 · Sep 10, 2025 · Updated 10 months ago
openaire / guidelines-literature-repositories
OpenAIRE Guidelines for Literature Repository Managers based on Dublin Core and DataCite Metadata Kernel
☆16 · Jun 4, 2026 · Updated last month
nla / outbackcdx
Web archive index server based on RocksDB
☆43 · Jul 9, 2026 · Updated last week
netarchivesuite / solrwayback
A search interface and wayback machine for the UKWA Solr based warc-indexer framework.
☆145 · Jul 13, 2026 · Updated last week
allmaps / viewer
☆13 · Mar 1, 2023 · Updated 3 years ago
slyrz / warc
Read and write WARC files in Go
☆50 · Apr 9, 2018 · Updated 8 years ago
UAlbanyArchives / describingWebArchives
Automating description for Web Archives in ArchivesSpace using the Archive-It CDX and Partner Data APIs
☆11 · Aug 10, 2018 · Updated 7 years ago
N0taN3rd / wail
One-Click User Instigated Preservation
☆128 · Feb 3, 2019 · Updated 7 years ago