An openly-licensed corpus of small example files, covering a wide range of formats and creation tools.
☆202 · Feb 16, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for format-corpus
Users who are interested in format-corpus are comparing it to the repositories listed below.
- Engine for analysis of Siegfried export files and DROID CSV. The tool has three purposes, break the export into its components and store … ☆33 · Dec 17, 2025 · Updated 2 months ago
- File validation and characterisation. ☆200 · Dec 4, 2025 · Updated 3 months ago
- Siegfried-based characterization tool for directories and disk images ☆91 · Nov 28, 2025 · Updated 3 months ago
- Test files for conformance testing and benchmarking Jpylyzer. ☆18 · Apr 2, 2024 · Updated last year
- Crawl Archivematica's Archival Information Packages (AIP) and provide repository-wide reporting. ☆14 · Updated this week
- Scripts for performing various tasks with the ArchivesSpace API ☆14 · Jun 27, 2024 · Updated last year
- Auto-generated static web site digipres.org ☆30 · Jan 12, 2026 · Updated last month
- Community Resource for Archivists and Librarians Scripting ☆25 · Oct 14, 2021 · Updated 4 years ago
- Create bags based on BagIt profiles and send them off into the ether (EasyStore is now DART) ☆58 · Updated this week
- Wrapper around hfsutils to generate DFXML for HFS-formatted disk images ☆11 · Apr 20, 2018 · Updated 7 years ago
- This software (prototype) extracts values of Excel spreadsheet properties and calculates a tentative spreadsheet complexity assessment ba… ☆13 · Dec 13, 2022 · Updated 3 years ago
- This is the repository for 2018's collaborative NaNoLiPo project. ☆33 · Nov 1, 2018 · Updated 7 years ago
- Thoughts toward and tutorial on corpus-driven narrative generation ☆25 · Nov 5, 2020 · Updated 5 years ago
- The study group Bits and Bots accommodates digital preservation professionals seeking coding abilities. In this repository, you can find … ☆41 · Feb 5, 2026 · Updated last month
- Generative poetry from a recurrent neural network filtered by emotional and external influences. ☆25 · May 15, 2016 · Updated 9 years ago
- SCOPE: An access interface for DIPs from Archivematica ☆24 · Feb 13, 2026 · Updated 3 weeks ago
- DROID (Digital Record and Object Identification) ☆364 · Updated this week
- A web application for human-friendly exploration of Archivematica METS files ☆25 · Sep 20, 2020 · Updated 5 years ago
- A Github Action for turning Markdown into ReSpec HTML ☆15 · Jun 6, 2024 · Updated last year
- Useful scripts ☆16 · Jan 13, 2026 · Updated last month
- Insert matching punctuation for mismatched quotation marks, parentheses, etc. Good postprocessing for N-gram text synthesis. ☆15 · Mar 29, 2016 · Updated 9 years ago
- Nanite - a friendly swarm of format-identifying robots. ☆16 · Dec 19, 2025 · Updated 2 months ago
- A GUI fuzzing application set up to fuzz calc.exe right now ☆37 · Aug 12, 2020 · Updated 5 years ago
- Delightful Static Digital Library projects and resources ☆35 · Nov 13, 2025 · Updated 3 months ago
- CCA Digital Archives Processing Manual ☆33 · Jan 7, 2026 · Updated last month
- Identify, review, and remove sensitive files ☆31 · Mar 5, 2023 · Updated 3 years ago
- A Java IIIF Presentation library ☆11 · Feb 18, 2026 · Updated 2 weeks ago
- A collection of resources re: emulation for preservation and access ☆22 · Dec 10, 2025 · Updated 2 months ago
- Allows users to import items from a simple CSV (comma separated values) file, and then map the CSV column data to multiple elements, file… ☆19 · Feb 20, 2026 · Updated 2 weeks ago
- RDF parsing for BaseX ☆17 · Oct 29, 2017 · Updated 8 years ago
- Work with BagIt packages from Python. ☆259 · Feb 27, 2026 · Updated last week
- JP2 (JPEG 2000 Part 1) validator and properties extractor. Jpylyzer was specifically created to check that a JP2 file really conforms to … ☆80 · Dec 2, 2025 · Updated 3 months ago
- signature-based file format identification ☆260 · Jan 23, 2026 · Updated last month
- This project has been archived and is no longer being developed or supported. The Curator's Workbench is an extensible digital collectio… ☆24 · Jun 25, 2020 · Updated 5 years ago
- Tropy plugin to import IIIF manifests ☆17 · Sep 16, 2024 · Updated last year
- Generative Grammar Compiler ☆19 · Nov 1, 2016 · Updated 9 years ago