arquivo / pwa-technologies
Arquivo.pt's main goal is to preserve and provide access to web content that is no longer available online. During the development of the PWA IR (information retrieval) system, we faced limitations in search speed, result quality, scalability and usability. To cope with this, we modified the archive-access project (http://archive-access.sourc…
☆51 Updated 3 months ago
Alternatives and similar repositories for pwa-technologies
Users who are interested in pwa-technologies are comparing it to the repositories listed below
- Converts WARC files to static HTML ☆49 Updated last month
- Centralised repository for WARC usage specifications. ☆118 Updated 3 weeks ago
- Command line tool for digging into WARC files ☆47 Updated 2 weeks ago
- A tool for collecting archival slivers of the web and web archives ☆16 Updated 8 months ago
- Webrecorder Automated In-Page Behavior Framework ☆13 Updated 4 years ago
- search interface for scholarly works ☆84 Updated last year
- CDXJ Indexing of WARC/ARCs ☆29 Updated 11 months ago
- A Rails engine supporting the discovery of web archives. ☆50 Updated 2 years ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆51 Updated last week
- Web archive index server based on RocksDB ☆36 Updated last week
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages. ☆38 Updated 6 months ago
- ☆54 Updated last year
- A social media open post web archiving tool ☆27 Updated last month
- wabac.js - Web Archive Browsing Augmentation Client ☆114 Updated last week
- Comparing warc files ☆17 Updated 6 years ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆46 Updated 7 years ago
- A Memento Aggregator CLI and Server in Go ☆70 Updated 8 months ago
- Perpetual Access To The Scholarly Record ☆119 Updated last year
- A set of utilities for processing MediaWiki XML dump data. ☆57 Updated 8 months ago
- The repo for the PetScan tool ☆57 Updated 3 weeks ago
- Nondestructive warc-in-tar to warc conversion ☆27 Updated 12 years ago
- Command line tool to convert a file in the WARC format to a file in the ZIM format ☆72 Updated 7 months ago
- ☆27 Updated 3 years ago
- A listing of world wide web archives, for humans and machines using Web Archive Manifest (WAM) yaml format ☆53 Updated 2 years ago
- A command line tool to archive a git repository from GitHub to the Internet Archive. ☆91 Updated 4 years ago
- Sort-friendly URI Reordering Transform (SURT) python module ☆44 Updated last month
- React components to render differences between captures at the Wayback Machine ☆35 Updated 6 months ago
- Web hub based on Wikidata ☆38 Updated 2 years ago
- User contributed (non Google) OCR models for Tesseract ☆29 Updated 6 months ago
- Create and edit WARC and WACZ files ☆17 Updated 11 months ago