helgeho / ArchiveSpark

An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.
148 · Updated last month

Alternatives and similar repositories for ArchiveSpark:

Users who are interested in ArchiveSpark are comparing it to the libraries listed below.