Fast PDF generation and compression. Deals with millions of pages daily.
☆136 · Mar 2, 2026 · Updated this week
Alternatives and similar repositories for archive-pdf-tools
Users who are interested in archive-pdf-tools are comparing it to the libraries listed below.
- Convert ALTO XML to plain text + minimal metadata ☆17 · Oct 17, 2024 · Updated last year
- Docker Compose based system for running remote browsers (including Flash and Java support) connected to web archives ☆16 · Jun 10, 2021 · Updated 4 years ago
- DjVu encoder with foreground/background separation ☆14 · Oct 17, 2025 · Updated 4 months ago
- Ergonomic line-by-line transcription of scanned text. ☆54 · Feb 2, 2026 · Updated last month
- Google Sheets to SQLite CLI tool. ☆13 · Aug 15, 2023 · Updated 2 years ago
- A Hypothes.is integration plugin for OJS ☆12 · Mar 17, 2025 · Updated 11 months ago
- Implementation of the Euclidean-Rhythms idea in the form of a plugin ☆13 · Apr 10, 2024 · Updated last year
- A Python library to add reconstructed pronunciations of Middle Chinese to Chinese texts ☆11 · Mar 13, 2023 · Updated 2 years ago
- Lua binding for the lol-HTML rewriter/parser ☆19 · Nov 14, 2020 · Updated 5 years ago
- Hubcap is an autonomous AI agent in 25 lines of code: a small Autobot that you can't trust. *This is the Python fork/port* from https://g… ☆22 · Nov 10, 2025 · Updated 3 months ago
- Container GitOps in a simple way ☆15 · Nov 18, 2024 · Updated last year
- Image Annotation Tool and Image Search ☆16 · Updated this week
- Correct skewed PDF documents using Hough Line Transform and Fourier Transform ☆18 · Sep 6, 2019 · Updated 6 years ago
- A tool that democratizes and standardizes access to Web APIs. ☆14 · Mar 2, 2023 · Updated 3 years ago
- Conversions between various OCR formats ☆83 · Feb 13, 2026 · Updated 3 weeks ago
- Homebrew formula and App bundler for Scantailor (Advanced) ☆175 · Jan 26, 2026 · Updated last month
- ScanTailor Universal - a fork based on Enhanced+Featured+Master versions of ST ☆243 · Updated this week
- CollectionBuilder-CSV is a "stand alone" template for creating digital collection and exhibit websites using Jekyll and a metadata CSV. ☆39 · Feb 20, 2026 · Updated 2 weeks ago
- A simple link checker in Go ☆14 · Feb 6, 2026 · Updated last month
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages. ☆41 · Nov 24, 2025 · Updated 3 months ago
- Pixi Object Model ☆14 · Sep 7, 2018 · Updated 7 years ago
- An XML parser for lezer ☆16 · Dec 27, 2024 · Updated last year
- Datasette plugin for inserting and updating data ☆20 · Mar 29, 2024 · Updated last year
- ISCC - Software Development Kit ☆19 · Updated this week
- Tools for working with book data ☆19 · Nov 25, 2025 · Updated 3 months ago
- Named Entity Recognition ☆19 · Feb 13, 2026 · Updated 3 weeks ago
- A command line utility for listing and searching snapshots in web archives ☆17 · Dec 21, 2023 · Updated 2 years ago
- GUI widgets for shell scripts. This is a scriptable engine which implements the idea of a GUI <-> text filter tool. ☆13 · Nov 16, 2020 · Updated 5 years ago
- A scraper that retrieves install/live images (.iso files) from different distributions ☆20 · Dec 10, 2022 · Updated 3 years ago
- Use triggers to track when rows in a SQLite table were updated or deleted ☆54 · Feb 15, 2026 · Updated 2 weeks ago
- Command line tool for digging into WARC files ☆51 · Feb 27, 2026 · Updated last week
- ScanTailor Advanced is the version that merges the features of the ScanTailor Featured and ScanTailor Enhanced versions, brings new ones … ☆1,417 · Sep 13, 2023 · Updated 2 years ago
- Leverage TAP to transform your ugly make outputs into nice readable ones using any TAP reporter ☆18 · Feb 16, 2016 · Updated 10 years ago
- Fork of anarki Arc with changes to the news code to support twostopbits.com ☆20 · Jan 12, 2026 · Updated last month
- DuckDB Engine as Google Sheets Library ☆20 · Dec 14, 2024 · Updated last year
- Self hosting code for Recogito-Studio ☆20 · Oct 16, 2025 · Updated 4 months ago
- Tropy plugin to import IIIF manifests ☆17 · Sep 16, 2024 · Updated last year
- Convert HTTP Archive (HAR) -> Web Archive (WARC) format ☆56 · Oct 21, 2018 · Updated 7 years ago
- FileTrove indexes files and creates metadata from them. ☆56 · Jan 4, 2026 · Updated 2 months ago