internetarchive/trough

internetarchive / trough

Trough: Big data, small databases.

☆43

Alternatives and similar repositories for trough

Users who are interested in trough are comparing it to the libraries listed below.

internetarchive / umbra
A queue-controlled browser automation tool for improving web crawl quality
☆68 · May 28, 2026 · Updated last month
webrecorder / har2warc
Convert HTTP Archive (HAR) -> Web Archive (WARC) format
☆55 · Oct 21, 2018 · Updated 7 years ago
oduwsdl / MemGator
A Memento Aggregator CLI and Server in Go
☆80 · Apr 9, 2026 · Updated 3 months ago
internetarchive / sandcrawler
Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki
☆28 · Jul 31, 2024 · Updated last year
ablwr / media-collection-viewer
visualizations/charts for media collections, based on mediainfo
☆14 · Sep 15, 2022 · Updated 3 years ago
anjackson / sliver
A tool for collection archival slivers of the web and web archives
☆19 · Jun 1, 2026 · Updated last month
ocfl-archive / gocfl
Go OCFL Implementation
☆18 · Jun 24, 2026 · Updated 3 weeks ago
harvard-lil / gitspoke
Download GitHub repositories
☆13 · May 10, 2025 · Updated last year
richardlehane / webarchive
golang readers for ARC and WARC webarchive formats
☆20 · Apr 3, 2023 · Updated 3 years ago
multimediamike / MobyCAIRO
Computer-assisted image straightening and cropping
☆30 · Aug 7, 2022 · Updated 3 years ago
internetarchive / brozzler
brozzler - distributed browser-based web crawler
☆809 · Jul 7, 2026 · Updated last week
kjaymiller / diversity-in-neurodiversity
Resources for those underrepresented folks being diagnosed
☆12 · Feb 10, 2022 · Updated 4 years ago
vinaygoel / ars-workshop
Archive Research Services Workshop
☆31 · Sep 29, 2017 · Updated 8 years ago
alexandrevilain / postgrest-auth
Easily add authentication to your postgrest API
☆25 · Feb 14, 2019 · Updated 7 years ago
mekarpeles / math.mx
A comprehensive graph of mathematical domains and topics
☆24 · Jan 8, 2022 · Updated 4 years ago
iipc / warc-specifications
Centralised repository for WARC usage specifications.
☆129 · Apr 4, 2026 · Updated 3 months ago
caraesten / telematic
Connect to telnet libraries over serial with ease!
☆10 · Feb 28, 2021 · Updated 5 years ago
cldellow / datasette-ui-extras
Add editing UI and other power-user features to Datasette.
☆14 · Mar 4, 2023 · Updated 3 years ago
webrecorder / specs
Specifications developed and maintained by the Webrecorder community.
☆142 · Oct 16, 2025 · Updated 9 months ago
IISH / oai4solr
OAI-PMH plugin for Solr
☆23 · May 12, 2021 · Updated 5 years ago
slifty / tvarchive-duplitron
☆29 · Nov 28, 2016 · Updated 9 years ago
k-int / gokb-phase1
Original GOKb repo - Moving to https://github.com/openlibraryenvironment/gokb
☆11 · Jan 23, 2018 · Updated 8 years ago
slub / lod-explorativ
lod-explorativ is a prototype of a Svelte webapp which let you explore bibliographic resources from a topic's point of view.
☆15 · Jan 19, 2022 · Updated 4 years ago
zimeon / ocfl-py
OCFL tools in Python
☆25 · Jun 26, 2026 · Updated 3 weeks ago
internetarchive / warcprox
WARC writing MITM HTTP/S proxy
☆456 · Jun 17, 2026 · Updated last month
natliblux / warc-safe
A tool for detecting viruses and NSFW material in WARC files
☆18 · Updated this week
internetarchive / wayback-diff
React components to render differences between captures at the Wayback Machine
☆43 · Jul 6, 2026 · Updated 2 weeks ago
internetarchive / arch
Web application for distributed compute analysis of Archive-It web archive collections.
☆20 · Mar 24, 2026 · Updated 3 months ago
internetarchive / dweb-transports
☆28 · Jul 18, 2023 · Updated 3 years ago
stevenferrer / multi-select-facet
An example of multi-select facet with Solr, Vue and Go
☆35 · Mar 11, 2023 · Updated 3 years ago
internetarchive / pdf_trio
A PDF classifier ensemble with REST API service
☆23 · Mar 5, 2021 · Updated 5 years ago
RichardLitt / Quick-tips-for-making-your-software-outlive-your-job
The paper repository for "10 quick tips for making your software outlive your job"
☆20 · Oct 28, 2025 · Updated 8 months ago
TeamHG-Memex / sitehound-frontend
Site Hound (previously THH) is a Domain Discovery Tool
☆24 · Apr 8, 2026 · Updated 3 months ago
WebarchivCZ / Seeder
Seeder - Czech webarchive curating tool and public site
☆17 · Feb 12, 2026 · Updated 5 months ago
internetarchive / Sparkling
Internet Archive's Sparkling Data Processing Library
☆17 · May 4, 2026 · Updated 2 months ago
internetarchive / warctools
Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)
☆176 · Aug 18, 2025 · Updated 11 months ago
JohnMarkOckerbloom / ftl
Forward to Libraries service (selected code and data)
☆20 · Updated this week
britishlibrary / mpt
A utility for staging files, calculating and validating file checksums, and comparing checksum values between storage locations.
☆14 · Jul 10, 2023 · Updated 3 years ago
qjerome / magic-rs
Safe Rust implementation of libmagic
☆34 · Jun 11, 2026 · Updated last month