arquivo / pwa-technologies
Arquivo.pt's main goal is the preservation of, and access to, web content that is no longer available online. During the development of the PWA IR (information retrieval) system we faced limitations in search speed, quality of results, scalability and usability. To cope with this, we modified the archive-access project (http://archive-access.sourc…
☆52 Updated last month
Alternatives and similar repositories for pwa-technologies
Users that are interested in pwa-technologies are comparing it to the libraries listed below
- Converts WARC files to static HTML ☆49 Updated 3 months ago
- Centralised repository for WARC usage specifications. ☆120 Updated 2 months ago
- Command line tool for digging into WARC files ☆49 Updated 2 weeks ago
- Webrecorder Automated In-Page Behavior Framework ☆13 Updated 4 years ago
- CDXJ Indexing of WARC/ARCs ☆31 Updated last year
- A tool for collecting archival slivers of the web and web archives ☆16 Updated 10 months ago
- A Rails engine supporting the discovery of web archives. ☆50 Updated 2 years ago
- A Memento Aggregator CLI and Server in Go ☆73 Updated 10 months ago
- A social media open post web archiving tool ☆27 Updated last month
- wabac.js - Web Archive Browsing Augmentation Client ☆117 Updated 3 weeks ago
- A command line utility for listing and searching snapshots in web archives ☆17 Updated 2 years ago
- Comparing WARC files ☆17 Updated 6 years ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆54 Updated last month
- ☆55 Updated last year
- Nondestructive warc-in-tar to warc conversion ☆27 Updated 12 years ago
- A GitHub Action for turning Markdown into ReSpec HTML ☆14 Updated last year
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages. ☆39 Updated last month
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆47 Updated 8 years ago
- A listing of world wide web archives, for humans and machines using Web Archive Manifest (WAM) yaml format ☆52 Updated 3 years ago
- Fast PDF generation and compression. Deals with millions of pages daily. ☆130 Updated 3 months ago
- The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives. ☆151 Updated last month
- Web archive index server based on RocksDB ☆37 Updated this week
- The repo for the PetScan tool ☆57 Updated 2 months ago
- Search interface for scholarly works ☆85 Updated last year
- Please note that the warc-indexer tool & code is now supported by NetArchiveSuite. The 'warc-indexer' directory and code that exists in t… ☆132 Updated last month
- Searchable Linkable Open Public Indexed (SLOPI) Communication ☆21 Updated 2 years ago
- Export data from Twitter archive and visualize it ☆25 Updated 3 years ago
- Static Site Generator for Viewing Web Archives (in WACZ) format ☆29 Updated 2 years ago
- A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service ☆188 Updated last year
- A command line tool to archive a git repository from GitHub to the Internet Archive. ☆92 Updated 4 years ago