datatogether/research

📚 A compilation of research relevant to Data Together's efforts tackling the general problem of data resilience & interactivity

☆100

Alternatives and similar repositories for research

Users interested in research are comparing it to the repositories listed below.

webrecorder / warcit
Convert Directories, Files and ZIP Files to Web Archives (WARC)
☆99 · Apr 22, 2025 · Updated last year
datatogether / webapp
Web application to allow users to add content metadata about crawled resources
☆13 · Feb 15, 2018 · Updated 8 years ago
edgi-govdata-archiving / web-monitoring-processing
Tools for access, "diff"-ing, and analyzing archived web pages
☆23 · Jul 1, 2026 · Updated 2 weeks ago
aurelg / linkbak
linkbak is a web page archiver: it reads a list of links and dumps the corresponding pages in HTML and PDF.
☆13 · Dec 8, 2022 · Updated 3 years ago
oduwsdl / Reconstructive
A ServiceWorker for client-side reconstruction of composite mementos
☆15 · Mar 6, 2025 · Updated last year
ArchiveBox / pip-archivebox
Official Python package for ArchiveBox, the self-hosted internet archiving solution.
☆12 · Oct 5, 2024 · Updated last year
N0taN3rd / Squidwarc
Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head
☆178 · May 19, 2020 · Updated 6 years ago
iipc / openwayback
The OpenWayback Development
☆522 · Jan 3, 2024 · Updated 2 years ago
joehand / hyperarchiver
Host, backup, and share hyperdrive archives
☆13 · Aug 22, 2017 · Updated 8 years ago
internetarchive / analyze_ocr
Parse OCR result files for pagenos, tables of contents, etc.
☆14 · Nov 30, 2011 · Updated 14 years ago
internetarchive / brozzler
brozzler - distributed browser-based web crawler
☆809 · Jul 7, 2026 · Updated 2 weeks ago
hypercore-protocol / hyperdrive-schemas
Protobuf/gRPC schemas for the Hyperdrive API
☆14 · Jul 14, 2020 · Updated 6 years ago
harvard-lil / warcgames
Hacking challenges to learn web archive security.
☆35 · Jun 23, 2017 · Updated 9 years ago
qri-io / rfcs
Request For Comments (RFCs) documenting changes to Qri
☆12 · Nov 23, 2021 · Updated 4 years ago
edgi-govdata-archiving / web-monitoring-ui
UI to enable analysts to quickly assess changes to monitored government websites
☆40 · Jul 1, 2026 · Updated 2 weeks ago
DanePubliczneGovPl / ckanext-danepubliczne
Layout and custom fields for DanePubliczne.gov.pl
☆10 · Dec 7, 2022 · Updated 3 years ago
DocNow / waybackprov
utility to fetch provenance information from Internet Archive's Wayback Machine
☆15 · Feb 5, 2026 · Updated 5 months ago
machawk1 / awesome-memento
A list of things related to software, literature, and other content for 🕣 Memento
☆121 · May 22, 2026 · Updated last month
whyrusleeping / ipfs-counter
A tool to scrape the ipfs network for information on the number of peers in the network.
☆21 · Mar 22, 2024 · Updated 2 years ago
machawk1 / wail
Web Archiving Integration Layer: One-Click User Instigated Preservation
☆398 · Jun 19, 2026 · Updated last month
RvanVeenendaal / Spreadsheet-Complexity-Analyser
This software (prototype) extracts values of Excel spreadsheet properties and calculates a tentative spreadsheet complexity assessment ba…
☆13 · May 15, 2026 · Updated 2 months ago
qri-io / 2017-frontend
qri electron & web frontend
☆23 · Aug 10, 2021 · Updated 4 years ago
datatogether / sentry
Parallelized web crawler written in Golang
☆15 · Oct 2, 2018 · Updated 7 years ago
Famicoman / ia-ul-from-youtubedl
Uploads items into the Internet Archive after they have been downloaded with youtube-dl
☆15 · Feb 28, 2015 · Updated 11 years ago
jorhett / scrivener-htmlbook
How to use Scrivener to write HTMLBook
☆17 · Jun 15, 2021 · Updated 5 years ago
DigitalTransgenderArchive / homosaurus_site
The public display of the homosaurus vocabulary.
☆12 · Jun 30, 2026 · Updated 3 weeks ago
joehand / dat-download
simple dat downloading module
☆10 · Jan 11, 2023 · Updated 3 years ago
RetroRodent / tumblrfollows
Basic python script to list following and followed blogs on Tumblr
☆20 · Oct 16, 2014 · Updated 11 years ago
mdlincoln / ulanr
Reconcile artist names to the Getty Union List of Artist Names
☆20 · Oct 10, 2016 · Updated 9 years ago
RockefellerArchiveCenter / project_electron
Documentation for Project Electron
☆14 · Dec 2, 2024 · Updated last year
dat-ecosystem-archive / dat-encoding
Dat's way of encoding and decoding dat links [ DEPRECATED - see https://github.com/mafintosh/abstract-encoding and https://github.com/com…
☆19 · Jan 6, 2022 · Updated 4 years ago
qri-io / dataset
qri dataset definition
☆15 · Sep 24, 2021 · Updated 4 years ago
Isicson / Connexion-Macro-Homosaurus
A macro created for OCLC Connexion for adding Homosaurus terms to bibliographic records.
☆14 · May 1, 2025 · Updated last year
samvera / iiif_manifest
☆12 · Jul 13, 2026 · Updated last week
IQSS / dataverse.harvard.edu
Custom code for dataverse.harvard.edu and an issue tracker for the IQSS Dataverse team's operational work, for better tracking on https:/…
☆13 · Jul 1, 2026 · Updated 2 weeks ago
recrm / ArchiveTools
A collection of tools for archiving and analysing the internet.
☆79 · Jul 6, 2022 · Updated 4 years ago
imanzarrabian / StarsToRain
Simple Python script that exports all Github Stars for a given user into an HTML file importable by Raindrop.io
☆16 · Nov 12, 2020 · Updated 5 years ago
oduwsdl / MemGator
A Memento Aggregator CLI and Server in Go
☆80 · Apr 9, 2026 · Updated 3 months ago
digst / DCAT-AP-DK
DCAT-AP-DK is a Danish application profile for describing datasets and data catalogs
☆10 · Updated this week