arquivo/pwa-technologies

Arquivo.pt's main goal is to preserve and provide access to web content that is no longer available online. While developing the PWA IR (information retrieval) system, we faced limitations in search speed, result quality, scalability, and usability. To cope with this, we modified the archive-access project (http://archive-access.sourc…

☆52

Alternatives and similar repositories for pwa-technologies

Users who are interested in pwa-technologies are comparing it to the repositories listed below.

anjackson / sliver
A tool for collecting archival slivers of the web and web archives
☆19 · Jun 1, 2026 · Updated 2 months ago
nla / outbackcdx
Web archive index server based on RocksDB
☆43 · Jul 9, 2026 · Updated 3 weeks ago
ukwa / ukwa-pywb
☆11 · Nov 21, 2025 · Updated 8 months ago
web-archive-group / hackathon
☆14 · Feb 28, 2017 · Updated 9 years ago
webrecorder / public-web-archives
A listing of world wide web archives, for humans and machines, using the Web Archive Manifest (WAM) YAML format
☆55 · Dec 5, 2022 · Updated 3 years ago
mementoweb / py-memento-client
A Memento Client Library in Python
☆27 · Mar 5, 2018 · Updated 8 years ago
iipc / jwarc
Java library for reading and writing WARC files with a typed API
☆60 · Jun 27, 2026 · Updated last month
vphill / web-archiving-course
Web Archiving Course
☆23 · Mar 4, 2024 · Updated 2 years ago
netarchivesuite / solrwayback
A search interface and wayback machine for the UKWA Solr based warc-indexer framework.
☆145 · Jul 23, 2026 · Updated last week
webrecorder / cdxj-indexer
CDXJ Indexing of WARC/ARCs
☆35 · May 11, 2026 · Updated 2 months ago
centraldedados / parlamento
🏛🇵🇹 Open data from the Assembleia da República (the Portuguese parliament)
☆17 · Jun 8, 2019 · Updated 7 years ago
ruby-microservices / noid
Nice Opaque Identifier
☆16 · Sep 21, 2023 · Updated 2 years ago
webrecorder / pywb-remote-browsers
Docker Compose based system for running remote browsers (including Flash and Java support) connected to web archives
☆16 · Jun 10, 2021 · Updated 5 years ago
summa-platform / summa-oss
Meta-repository for the open-source version of the SUMMA Platform
☆16 · Mar 25, 2024 · Updated 2 years ago
netarchivesuite / netarchivesuite
Netarchivesuite development
☆23 · May 15, 2026 · Updated 2 months ago
hackla-engage / engage-client
☆13 · Dec 7, 2022 · Updated 3 years ago
iipc / warc2html
Converts WARC files to static HTML
☆59 · Sep 18, 2025 · Updated 10 months ago
plummerfernandez / Decoy-Browsing
Automated browsing for Amazon, Google and Facebook
☆10 · Jan 8, 2016 · Updated 10 years ago
codeclou-archive / docker-nodejs-chrome-xvfb
Docker image to build Node.js-based projects and to be able to run headless Chrome
☆11 · Sep 2, 2019 · Updated 6 years ago
JamesCoyle / HistoryExtension
☆15 · Oct 26, 2022 · Updated 3 years ago
bsdphk / AardWARC
Museum-quality bit-archive storage management
☆11 · Mar 25, 2026 · Updated 4 months ago
machawk1 / awesome-memento
A list of things related to software, literature, and other content for 🕣 Memento
☆121 · May 22, 2026 · Updated 2 months ago
sergiotapia / ekeko
Ekeko is a tool that helps you save all of your favorited memes, videos and other online resources.
☆15 · Oct 27, 2022 · Updated 3 years ago
vinaygoel / ars-workshop
Archive Research Services Workshop
☆31 · Sep 29, 2017 · Updated 8 years ago
chfoo / huhhttp
An evil web server.
☆13 · May 9, 2015 · Updated 11 years ago
NationalLibraryOfNorway / warchaeology
Command line tool for digging into WARC files
☆50 · Updated this week
webrecorder / browsertrix-old
Browsertrix: Containerized High-Fidelity Browser-Based Automated Crawling + Behavior System
☆87 · Feb 16, 2021 · Updated 5 years ago
Rhizome-Conifer / conifer-deploy
Conifer setup and deployment via Ansible
☆12 · Jun 15, 2020 · Updated 6 years ago
webrecorder / py-wacz
☆61 · Apr 11, 2024 · Updated 2 years ago
phonedude / cs595-s21
CS 495/595 Web Security
☆10 · Feb 27, 2022 · Updated 4 years ago
LarryLuTW / facebook-poster
Facebook-Poster is an API that automates posting functionality on Facebook.
☆11 · Jun 16, 2025 · Updated last year
ktemkin / usb-hacking-logos
Public domain USB hacking logos
☆12 · Sep 23, 2021 · Updated 4 years ago
helgeho / ArchiveSpark
An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed…
☆161 · Oct 8, 2025 · Updated 9 months ago
cdbeland / moss
Searching for misspellings, bad grammar, and violations of the Manual of Style in Wikipedia
☆13 · May 28, 2026 · Updated 2 months ago
lostatc / indiefeed.link
A landing page for web feeds
☆12 · Sep 17, 2024 · Updated last year
whyrusleeping / ipfs-counter
A tool to scrape the IPFS network for information on the number of peers in the network.
☆21 · Mar 22, 2024 · Updated 2 years ago
steffenfritz / html2warc
Simple script to convert web resources to a single WARC file
☆24 · May 11, 2023 · Updated 3 years ago
TaylorJadin / site-archiving-toolkit
☆10 · Updated this week
jin530 / MelBERT
This is the official code for the NAACL 2021 paper: "MelBERT: Metaphor Detection via Contextualized Late Interaction using Metaphorical Identi…
☆52 · Feb 16, 2023 · Updated 3 years ago