internetarchive/analyze_ocr

Parse OCR result files for pagenos, tables of contents, etc.

☆14

Alternatives and similar repositories for analyze_ocr

Users who are interested in analyze_ocr are comparing it to the repositories listed below.

PRImA-Research-Lab / prima-aletheia-web-emop
Web-based page layout editor created for EMOP (Early Modern OCR Project).
☆11 · May 21, 2021 · Updated 5 years ago
steffenfritz / html2warc
simple script to convert web resources to a single warc file
☆24 · May 11, 2023 · Updated 3 years ago
time-machine-project / requests-for-comments
The main repository for Time Machine Requests for Comments drafts and releases
☆12 · Mar 24, 2025 · Updated last year
PRImA-Research-Lab / prima-core-libs
Core libraries by the PRImA Research Lab
☆16 · Jul 30, 2024 · Updated last year
IBM / graph-db-insights
Get insights from OrientDB database using PyOrient through IBM Watson Studio
☆13 · Apr 22, 2019 · Updated 7 years ago
samvera-labs / geomash
This is a geographic parsing library. Beyond regular parsing to mapquest, bing, and google apis, it also can parse subject strings and qu…
☆20 · May 6, 2026 · Updated 2 months ago
Xion / taipan
General purpose toolkit for Python
☆11 · Jan 27, 2024 · Updated 2 years ago
zmbq / pyexistdb
Utilities for accessing and searching objects in an eXist-db XML database using idiomatic Python, XPath, and XQuery
☆12 · Feb 21, 2024 · Updated 2 years ago
PRImA-Research-Lab / prima-gwt-lib
Library with user interface elements and client-server communication classes based on Google Web Toolkit (GWT) that can be used for crowd…
☆14 · Oct 3, 2017 · Updated 8 years ago
leovt / leovt
Collection of Sample Programs
☆10 · Nov 17, 2020 · Updated 5 years ago
europeana / europeana-portal-collections
Europeana Collections portal as a Rails + Blacklight application.
☆19 · Apr 11, 2022 · Updated 4 years ago
amyreese / nib
static site generator with content pipeline
☆19 · Dec 25, 2023 · Updated 2 years ago
google / DAPLink-port
☆18 · Jun 29, 2022 · Updated 4 years ago
ncbo / goo
Graph Oriented Objects (GOO) for Ruby. An RDF/SPARQL-based ORM.
☆15 · Updated this week
stevenhauser / atom-simple-align
Simple multiple cursor alignment for Atom text editor
☆13 · Nov 5, 2019 · Updated 6 years ago
altomator / EN-data_mining
Data Mining Historical Newspaper Metadata (METS/ALTO formats)
☆25 · Feb 6, 2026 · Updated 5 months ago
tacnetsol / TRENDNetExploits
Exploits for TRENDNet routers
☆14 · Apr 21, 2020 · Updated 6 years ago
google / vbootrom
☆16 · Jan 14, 2026 · Updated 6 months ago
dpla-attic / alpha-platform
The Alpha DPLA Platform repository
☆34 · Apr 24, 2012 · Updated 14 years ago
JamesCoyle / HistoryExtension
☆16 · Oct 26, 2022 · Updated 3 years ago
sergiotapia / ekeko
Ekeko is a tool that helps you save all of your favorited memes, videos and other online resources.
☆15 · Oct 27, 2022 · Updated 3 years ago
openownership / register
A demonstration transnational register of beneficial ownership data from the UK, Denmark, Slovakia and Armenia
☆19 · Oct 30, 2024 · Updated last year
wihl / Timberwolf
Hadoop HBase ingestion of Microsoft Exchange
☆15 · Apr 6, 2012 · Updated 14 years ago
riktw / SoftcoreComparisons
The code for an FPGA softcore comparison
☆11 · Jun 21, 2020 · Updated 6 years ago
DSTCyber / safe-deobs
A static deobfuscator for JavaScript Malware
☆13 · May 6, 2020 · Updated 6 years ago
jspenguin2017 / JavaScriptAnalyzer
A tool that helps with analysis of obfuscated JavaScript
☆11 · Dec 15, 2023 · Updated 2 years ago
Aloisius / nutch
CommonCrawl Test version of Nutch
☆16 · Jul 10, 2014 · Updated 12 years ago
plexus / .emacs.d
My Emacs configuration
☆11 · Oct 12, 2021 · Updated 4 years ago
nirbheek / youtube-ass
Converts YouTube XML Annotations to ASS subtitles
☆17 · Dec 17, 2018 · Updated 7 years ago
flokli / pynq
Tooling to use the Pynq Board somewhat nicely
☆13 · Dec 5, 2022 · Updated 3 years ago
cdbeland / moss
Searching for misspelling, bad grammar, and violations of the Manual of Style in Wikipedia
☆13 · May 28, 2026 · Updated last month
cds4 / inkscape-grids
Triangular and perspective grid creation extensions for inkscape
☆20 · Aug 18, 2020 · Updated 5 years ago
web-archive-group / hackathon
☆14 · Feb 28, 2017 · Updated 9 years ago
whisk-ml / whisk
whisk is a data science project framework that makes collaboration, reproducibility, and deployment "just work".
☆11 · Dec 26, 2022 · Updated 3 years ago
guerilla-di / timecode
Work with SMPTE timecode data
☆26 · May 16, 2024 · Updated 2 years ago
kanishka-linux / ReadManga
A GNU/Linux Desktop Application for reading Japanese Manga from various sites available on the internet
☆17 · May 16, 2017 · Updated 9 years ago
oozie / python-fsm
Python Finite State Machine implementation with a pygraphviz hook
☆21 · Apr 3, 2019 · Updated 7 years ago
cyan06 / automatic-save-folder
Firefox Addon to automatically select the save folder (local destination folder) based on filters.
☆12 · Apr 16, 2016 · Updated 10 years ago
benosteen / pairtree
Python Pairtree implementation
☆19 · Mar 26, 2018 · Updated 8 years ago