pdf-association / safedocs
Artifacts from the DARPA-funded SafeDocs research program
☆25 Updated 2 years ago
Alternatives and similar repositories for safedocs
Users who are interested in safedocs are comparing it to the repositories listed below
- PDF Name Registry ☆22 Updated last month
- CDXJ Indexing of WARC/ARCs ☆27 Updated 7 months ago
- XML Schema for Digital Forensics XML ☆35 Updated 5 months ago
- A vendor- and implementation-independent specification-derived, machine-readable model of PDF. ☆86 Updated 2 months ago
- A command line utility for listing and searching snapshots in web archives ☆16 Updated last year
- Targeted PDFs demonstrating commonly seen PDF differentials and interoperability issues ☆13 Updated 2 months ago
- A Memento Aggregator CLI and Server in Go ☆67 Updated 5 months ago
- A listing of world wide web archives, for humans and machines using Web Archive Manifest (WAM) yaml format ☆53 Updated 2 years ago
- Command line tool for digging into WARC files ☆44 Updated 2 weeks ago
- A Github Action for turning Markdown into ReSpec HTML ☆14 Updated last year
- File validation and characterisation. ☆181 Updated this week
- This software (prototype) extracts values of Excel spreadsheet properties and calculates a tentative spreadsheet complexity assessment ba… ☆13 Updated 2 years ago
- Web archive index server based on RocksDB ☆34 Updated 3 weeks ago
- Selected code and data for The Online Books Page and related applications ☆11 Updated this week
- Format Identification for Digital Objects (FIDO) is a Python command-line tool to identify the file formats of digital objects. It is des… ☆157 Updated 4 months ago
- An openly-licensed corpus of small example files, covering a wide range of formats and creation tools. ☆198 Updated 2 months ago
- Convert HTTP Archive (HAR) -> Web Archive (WARC) format ☆52 Updated 6 years ago
- A client for the Archive-It And Webrecorder WASAPI Data Transfer API ☆16 Updated 5 years ago
- BitCurator Environment: Using, building, and maintaining BitCurator ☆57 Updated last year
- A tool for creating and managing Mailbags, a package for preserving email using multiple preservation formats ☆48 Updated this week
- Static Site Generator for Viewing Web Archives (in WACZ) format ☆27 Updated 2 years ago
- Efficient hOCR tooling ☆48 Updated this week
- Auto-generated static web site digipres.org ☆27 Updated this week
- signature-based file format identification ☆243 Updated 3 months ago
- Specifications developed and maintained by the Webrecorder community. ☆132 Updated 6 months ago
- Converts WARC files to static HTML ☆46 Updated last year
- A persistent repository for PRONOM Research Week activities ☆12 Updated 4 years ago
- A tool for detecting viruses and NSFW material in WARC files ☆15 Updated 11 months ago
- Single server/laptop grade file-observatory ☆10 Updated 2 years ago