internetarchive / Sparkling
Internet Archive's Sparkling Data Processing Library
☆15 · Updated this week
Alternatives and similar repositories for Sparkling
Users who are interested in Sparkling are comparing it to the libraries listed below.
- A search interface and wayback machine for the UKWA Solr based warc-indexer framework. ☆134 · Updated last week
- WASAPI data transfer APIs ☆48 · Updated 3 years ago
- ☆16 · Updated 9 months ago
- Automating description for Web Archives in ArchivesSpace using the Archive-It CDX and Partner Data APIs ☆11 · Updated 7 years ago
- Command line tool for digging into WARC files ☆50 · Updated last week
- Web application for distributed compute analysis of Archive-It web archive collections. ☆20 · Updated 4 months ago
- Identify, review, and remove sensitive files ☆30 · Updated 2 years ago
- The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives. ☆152 · Updated 2 months ago
- A tool for creating and managing Mailbags, a package for preserving email using multiple preservation formats ☆50 · Updated 2 months ago
- Prototype SOLR-powered web archive exploration UI. ☆43 · Updated 5 years ago
- ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes… ☆126 · Updated 2 months ago
- Siegfried-based characterization tool for directories and disk images ☆89 · Updated 2 months ago
- Efficient indexing and retrieval of OCR bounding boxes in Solr ☆22 · Updated 6 years ago
- Archive Research Services Workshop ☆31 · Updated 8 years ago
- Archipelago Commons Docker Deployment Repository ☆34 · Updated last month
- A command line utility for listing and searching snapshots in web archives ☆17 · Updated 2 years ago
- The Oxford Common File Layout (OCFL) specifications and website ☆64 · Updated last week
- Documentation for Project Electron ☆14 · Updated last year
- ANNotation Infrastructure using Finna: an automatic subject indexing tool using Finna as corpus ☆15 · Updated 7 years ago
- A Github Action for turning Markdown into ReSpec HTML ☆15 · Updated last year
- Please note that the warc-indexer tool & code is now supported by NetArchiveSuite. The 'warc-indexer' directory and code that exists in t… ☆132 · Updated 2 months ago
- ☆35 · Updated 2 years ago
- Validator for the Presentation API ☆47 · Updated 2 weeks ago
- A command line utility for converting MARC to CSV (and Parquet, etc) ☆28 · Updated 8 months ago
- VIAF via Python ☆13 · Updated 8 months ago
- An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed… ☆156 · Updated 4 months ago
- Repository for the book Among Digitized Manuscripts by L.W. Cornelis van Lit (Leiden: Brill, 2020) ☆25 · Updated 5 years ago
- OCFL tools in Python ☆25 · Updated 5 months ago
- A CLI for OCFL repositories ☆19 · Updated 2 weeks ago
- This repository has migrated to: ☆100 · Updated 3 months ago