pdf-association / safedocs
Artifacts from the DARPA-funded SafeDocs research program
☆24 Updated last year
Alternatives and similar repositories for safedocs:
Users who are interested in safedocs are comparing it to the repositories listed below.
- veraPDF test corpus for ISO 19005 (PDF/A) and ISO 14289 (PDF/UA) ☆76 Updated last week
- A vendor- and implementation-independent specification-derived, machine-readable model of PDF. ☆84 Updated last month
- Targeted PDFs demonstrating commonly seen PDF differentials and interoperability issues ☆12 Updated this week
- ☆10 Updated 3 years ago
- Industry-based resolutions for issues and errata reported against any PDF-related specification ☆73 Updated last week
- PDF Name Registry ☆19 Updated last month
- ☆14 Updated last year
- CDXJ Indexing of WARC/ARCs ☆25 Updated 4 months ago
- CLI implementation of httpreserve that can test links and retrieve internet archive replacements ☆10 Updated 5 months ago
- A tool for detecting viruses and NSFW material in WARC files ☆14 Updated 8 months ago
- Format Identification for Digital Objects (FIDO) is a Python command-line tool to identify the file formats of digital objects. It is des… ☆157 Updated last month
- Selected code and data for The Online Books Page and related applications ☆11 Updated last week
- XML Schema for Digital Forensics XML ☆35 Updated 3 months ago
- A command line utility for listing and searching snapshots in web archives ☆16 Updated last year
- Efficient hOCR tooling ☆44 Updated last week
- veraPDF GUI, CLI and installer ☆86 Updated this week
- This software (prototype) extracts values of Excel spreadsheet properties and calculates a tentative spreadsheet complexity assessment ba… ☆13 Updated 2 years ago
- A GitHub Action for turning Markdown into ReSpec HTML ☆14 Updated 11 months ago
- An openly-licensed corpus of small example files, covering a wide range of formats and creation tools. ☆194 Updated last year
- Static Site Generator for Viewing Web Archives (in WACZ format) ☆26 Updated last year
- Collections of individual rules and combined veraPDF validation profiles for various validation flavors ☆16 Updated last week
- PDF 2.0 example files ☆90 Updated 3 months ago
- A fork of the disktype disk and disk image format detection tool ☆10 Updated 8 years ago
- A mirror of the PRONOM file format registry in Linked Open Data format. The Format Registry is a linked (open) data file format repositor… ☆10 Updated last year
- Command line tool for digging into WARC files ☆39 Updated this week
- Internet Archive's Sparkling Data Processing Library ☆13 Updated last month
- A persistent repository for PRONOM Research Week activities ☆12 Updated 3 years ago
- FileTrove indexes files and creates metadata from them. ☆44 Updated 3 weeks ago
- An open source set of decks for learning about digital preservation. ☆23 Updated 5 years ago
- Single server/laptop grade file-observatory ☆10 Updated 2 years ago