internetarchive / analyze_ocr
Parse OCR result files for pagenos, tables of contents, etc.
☆14Updated 13 years ago
Alternatives and similar repositories for analyze_ocr:
Users who are interested in analyze_ocr are comparing it to the libraries listed below:
- A Rails engine supporting the discovery of web archives.☆50Updated last year
- Automatic alignment of books between HathiTrust, Internet Archive, Google Books, etc.☆35Updated last week
- This software (prototype) extracts values of Excel spreadsheet properties and calculates a tentative spreadsheet complexity assessment ba…☆13Updated 2 years ago
- Open ONI (Open Online Newspaper Initiative) Django web app☆49Updated this week
- No longer maintained. Please use conciliator instead.☆26Updated 4 years ago
- Work to make the LDR PREMIS-compliant☆8Updated 8 years ago
- DEPRECATED. Replaced with Electron desktop application: https://github.com/bulk-reviewer/bulk-reviewer☆13Updated 5 years ago
- Crawl Archivematica's Archival Information Packages (AIP) and provide repository-wide reporting.☆12Updated this week
- ☆14Updated 8 years ago
- Django app for managing PREMIS Events☆14Updated last month
- Tools for helping you work with web platform archive downloads.☆17Updated 5 years ago
- This repo holds the source code for the web application☆15Updated last year
- WASAPI data transfer APIs☆44Updated 2 years ago
- All that entity matching, resolution, normalization, enhancement and reconciliation madness, but with a focus on data, not platforms.☆24Updated 3 years ago
- A Rails engine for metadata aggregation, enhancement, and quality control.☆29Updated 8 years ago
- Prototype SOLR-powered web archive exploration UI.☆43Updated 4 years ago
- Extension of Zotero for cataloging☆49Updated last year
- rightsstatements.org data model☆12Updated 2 years ago
- ☆61Updated 2 years ago
- Open Semantic Search Appliance (VM)☆12Updated 4 years ago
- Open-source tools for working with BIBFRAME (see: http://bibframe.org), by default BIBFRAME Lite (see: http://bibfra.me) and more general…☆24Updated 3 years ago
- A python client for the DPLA API☆43Updated 2 years ago
- utility to fetch provenance information from Internet Archive's Wayback Machine☆13Updated 2 years ago
- A curated list of awesome Jupyter projects and guides from the GLAM community.☆19Updated 3 years ago
- Command-line tile downloader/assembler for IIIF endpoints/manifests☆34Updated 3 years ago
- GeoNames Reconciliation Service for OpenRefine/LODRefine/Google Refine☆48Updated 3 years ago
- The Web Curator Tool is a tool for managing the selective web harvesting process. (moved from SourceForge). https://webcurator.slack.com …☆27Updated 2 years ago
- Public website for version controlled Samvera documentation (mostly Hyrax)☆7Updated 7 months ago
- Prototype Wikidata portal project.☆10Updated 10 months ago
- Docker image for the Archives Unleashed Toolkit☆12Updated 2 years ago