WARC + AI - Experimental Retrieval Augmented Generation Pipeline for Web Archive Collections.
☆269 Feb 11, 2025 Updated last year
Alternatives and similar repositories for warc-gpt
Users that are interested in warc-gpt are comparing it to the libraries listed below.
- Command line tool for digging into WARC files ☆51 Updated this week
- A Lit web-component for viewing a Whisper JSON transcript file ☆14 Feb 12, 2026 Updated 2 months ago
- ☆17 Mar 31, 2025 Updated last year
- Create and edit WARC and WACZ files ☆25 Dec 6, 2024 Updated last year
- A list of awesome AI in libraries, archives, and museum collections from around the world 🕶️ ☆168 Jun 17, 2024 Updated last year
- Self hosting code for Recogito-Studio ☆22 Updated this week
- Python script to create CDX index files of WARC data ☆16 Sep 7, 2018 Updated 7 years ago
- CDXJ Indexing of WARC/ARCs ☆33 Updated this week
- ☆60 Apr 11, 2024 Updated 2 years ago
- The official Internet Archive IIIF service ☆26 Mar 19, 2026 Updated last month
- Web Archiving Course ☆23 Mar 4, 2024 Updated 2 years ago
- JavaScript module and CLI tool for working with web archive data using the WACZ format specification. ☆17 Mar 11, 2025 Updated last year
- OCR IIIF images in a manifest and generate annotations ☆26 Feb 11, 2025 Updated last year
- A client for the Archive-It And Webrecorder WASAPI Data Transfer API ☆16 Oct 18, 2019 Updated 6 years ago
- A frontend tool that leverages IIIF manifests and interweaves them into flexible layouts. ☆15 Jan 12, 2023 Updated 3 years ago
- This project is no longer supported. A pre-configured collection of tools including Social Feed Manager and Lentil for easily building Tw… ☆16 Feb 9, 2018 Updated 8 years ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆58 Updated this week
- Create new IIIF Manifests. Modify existing manifests. Tell stories with IIIF. Read the docs: https://manifest-editor-docs.netlify.app/ ☆41 Updated this week
- A VUE IIIF viewer ☆14 Dec 14, 2025 Updated 4 months ago
- Golang WARC (Web ARChive) Library ☆30 Aug 6, 2019 Updated 6 years ago
- ☆11 Nov 21, 2025 Updated 4 months ago
- LLM Analytics ☆709 Oct 19, 2024 Updated last year
- Specifications developed and maintained by the Webrecorder community. ☆141 Oct 16, 2025 Updated 6 months ago
- IIIF compatible viewer for digital born file storages ☆13 Apr 1, 2026 Updated 2 weeks ago
- OCFL implementation for Go ☆16 Feb 6, 2026 Updated 2 months ago
- Download digitized books from Internet Archive and view with IIIF, locally and offline. ☆38 Apr 19, 2024 Updated 2 years ago
- ☆11 Mar 31, 2023 Updated 3 years ago
- Typesafe IIIF presentation v3 parsing without external dependencies ☆12 Apr 9, 2026 Updated last week
- A React component for displaying high resolution IIIF images with deep zooming capabilities on mobile and desktop. ☆14 Jan 7, 2023 Updated 3 years ago
- Web archive index server based on RocksDB ☆41 Apr 1, 2026 Updated 2 weeks ago
- mirror a website, put it in a bag ☆24 Dec 18, 2022 Updated 3 years ago
- wabac.js - Web Archive Browsing Augmentation Client ☆125 Apr 9, 2026 Updated last week
- ☆10 Apr 26, 2016 Updated 9 years ago
- Instructions, exercises and example data sets for Annif hands-on tutorial ☆45 Nov 18, 2025 Updated 5 months ago
- 🗄️ A simple CLI for converting WARC to Parquet. ☆114 Feb 12, 2025 Updated last year
- (Experimental) High-fidelity capture of Twitter threads as sealed PDFs. ☆55 Dec 4, 2023 Updated 2 years ago
- Run a high-fidelity browser-based web archiving crawler in a single Docker container ☆1,020 Updated this week
- The command line tool to start IIIF from image file on your hard disc ☆30 Apr 11, 2026 Updated last week
- A command line utility for converting MARC to CSV (and Parquet, etc) ☆28 Jun 1, 2025 Updated 10 months ago