arquivo / pwa-technologies
The main goal of Arquivo.pt is to preserve and provide access to web content that is no longer available online. While developing the PWA IR (information retrieval) system, we faced limitations in search speed, quality of results, scalability and usability. To cope with this, we modified the archive-access project (http://archive-access.sourc…
☆48 Updated 2 weeks ago
Alternatives and similar repositories for pwa-technologies
Users who are interested in pwa-technologies are comparing it to the libraries listed below.
- Converts WARC files to static HTML ☆47 Updated last year
- Command line tool for digging into WARC files ☆45 Updated 3 weeks ago
- A social media open post web archiving tool ☆27 Updated 2 months ago
- Centralised repository for WARC usage specifications. ☆115 Updated 8 months ago
- CDXJ Indexing of WARC/ARCs ☆28 Updated 8 months ago
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages. ☆35 Updated 3 months ago
- A tool for collecting archival slivers of the web and web archives ☆14 Updated 5 months ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆46 Updated 2 weeks ago
- Web archive index server based on RocksDB ☆34 Updated last month
- wabac.js - Web Archive Browsing Augmentation Client ☆113 Updated this week
- Sort-friendly URI Reordering Transform (SURT) python module ☆42 Updated last year
- Nondestructive warc-in-tar to warc conversion ☆27 Updated 12 years ago
- Please note that the warc-indexer tool & code is now supported by NetArchiveSuite. The 'warc-indexer' directory and code that exists in t… ☆128 Updated 3 weeks ago
- A Memento Aggregator CLI and Server in Go ☆67 Updated 5 months ago
- A Rails engine supporting the discovery of web archives. ☆50 Updated 2 years ago
- Webrecorder Automated In-Page Behavior Framework ☆13 Updated 4 years ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆46 Updated 7 years ago
- Static Site Generator for Viewing Web Archives (in WACZ) format ☆27 Updated 2 years ago
- Fast PDF generation and compression. Deals with millions of pages daily. ☆118 Updated last month
- search interface for scholarly works ☆86 Updated last year
- JavaScript module and CLI tool for working with web archive data using the WACZ format specification. ☆16 Updated 5 months ago
- Create and edit WARC and WACZ files ☆13 Updated 8 months ago
- Specifications developed and maintained by the Webrecorder community. ☆134 Updated 7 months ago
- Perpetual Access To The Scholarly Record ☆120 Updated last year
- A Github Action for turning Markdown into ReSpec HTML ☆14 Updated last year
- (Experimental) High-fidelity capture of Twitter threads as sealed PDFs. ☆54 Updated last year
- A search interface and wayback machine for the UKWA Solr based warc-indexer framework. ☆130 Updated 2 weeks ago