arquivo / pwa-technologies
Arquivo.pt's main goal is to preserve and provide access to web content that is no longer available online. While developing the PWA IR (information retrieval) system, we faced limitations in search speed, quality of results, scalability and usability. To cope with this, we modified the archive-access project (http://archive-access.sourc…
☆52 Updated last month
Alternatives and similar repositories for pwa-technologies
Users who are interested in pwa-technologies are comparing it to the repositories listed below
- Converts WARC files to static HTML ☆49 Updated 2 months ago
- Centralised repository for WARC usage specifications. ☆119 Updated 2 months ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆53 Updated 2 weeks ago
- A social media open post web archiving tool ☆27 Updated last week
- Command line tool for digging into WARC files ☆49 Updated this week
- A tool for collecting archival slivers of the web and web archives ☆16 Updated 9 months ago
- Search interface for scholarly works ☆85 Updated last year
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆46 Updated 8 years ago
- Docker Compose based system for running remote browsers (including Flash and Java support) connected to web archives ☆16 Updated 4 years ago
- CDXJ Indexing of WARC/ARCs ☆31 Updated last year
- A Rails engine supporting the discovery of web archives. ☆50 Updated 2 years ago
- Please note that the warc-indexer tool & code is now supported by NetArchiveSuite. The 'warc-indexer' directory and code that exists in t… ☆131 Updated 3 weeks ago
- Comparing WARC files ☆17 Updated 6 years ago
- Perpetual Access To The Scholarly Record ☆120 Updated last year
- Web archive index server based on RocksDB ☆36 Updated last month
- ☆55 Updated last year
- A Memento Aggregator CLI and Server in Go ☆71 Updated 9 months ago
- The repo for the PetScan tool ☆57 Updated 2 months ago
- Export data from Twitter archive and visualize it ☆25 Updated 3 years ago
- Browser version of Hyphe (WIP) ☆32 Updated 6 months ago
- Sort-friendly URI Reordering Transform (SURT) Python module ☆44 Updated 3 months ago
- A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service ☆188 Updated last year
- Chrome extension that uses Memento to indicate that a page a user is viewing on the live web has an archived copy and to give the user ac… ☆55 Updated 3 months ago
- A listing of world wide web archives, for humans and machines using Web Archive Manifest (WAM) YAML format ☆52 Updated 3 years ago
- Scraper for German democracy documents ☆40 Updated 2 years ago
- Browsertrix: Containerized High-Fidelity Browser-Based Automated Crawling + Behavior System ☆87 Updated 4 years ago
- freeyourstuff.cc - universal content liberation ☆81 Updated 2 years ago
- Various examples of notebooks for working with web archives with the Archives Unleashed Toolkit, and derivatives generated by the Archive… ☆26 Updated 3 years ago
- wabac.js - Web Archive Browsing Augmentation Client ☆116 Updated last week
- Scraper for downloading the entire ebooks repository of Project Gutenberg ☆154 Updated this week