usnistgov/ocr-pipeline

Convert a corpus of PDF to clean text files on a distributed architecture

☆40

Alternatives and similar repositories for ocr-pipeline

Users who are interested in ocr-pipeline are comparing it to the repositories listed below.

linkalis / shiny-qualitativesurveytool
qualitative analysis tool built for R + Shiny
☆12 · Nov 10, 2014 · Updated 11 years ago
EuropeanaNewspapers / ner-app
Named Entity Recognition tool for Europeana Newspapers
☆14 · Apr 5, 2018 · Updated 8 years ago
digtrade / digtrade
Trading Consequences data and code
☆15 · Mar 5, 2015 · Updated 11 years ago
Early-Modern-OCR / FrankenPlus
Part of eMOP: Franken+ tool for creating font training for Tesseract OCR engine from page images.
☆24 · Sep 24, 2015 · Updated 10 years ago
andbue / nashi
Some bits of javascript to transcribe scanned pages using PageXML
☆17 · May 27, 2026 · Updated last month
nogenmyr / swiftSnap
☆14 · May 28, 2023 · Updated 3 years ago
openfoamtutorials / OpenFOAM_Additions
OpenFOAM Additions; Code that was geared to specific purposes and meant to be easily used by others.
☆15 · Apr 8, 2014 · Updated 12 years ago
ChatSecure / ChatSecure-Push-iOS
The iOS SDK for ChatSecure-Push-Server
☆16 · Oct 27, 2019 · Updated 6 years ago
appcelerator-developer-relations / appc-sample-watchos2
This app demonstrates WatchSession support in Titanium 5.0
☆11 · Sep 6, 2019 · Updated 6 years ago
UW-xDD / table-extract
Locate and extract tables and figures in PDFs
☆43 · Mar 19, 2021 · Updated 5 years ago
robocorp / example-desktop-image-ocr
Example robot for automating GnuCash with image templates and OCR
☆12 · Jan 23, 2024 · Updated 2 years ago
Comflics / Exploring-OpenFOAM
Exploring OpenFOAM®
☆19 · Jan 30, 2018 · Updated 8 years ago
KBNLresearch / ochre
Toolbox for OCR post-correction
☆120 · Sep 19, 2019 · Updated 6 years ago
liangstein / ByteNet-Keras
French to English translator on character level implemented by Keras
☆10 · Jun 15, 2017 · Updated 9 years ago
appcelerator-developer-relations / plexus-rx
PlexusRx demo app for iOS 11
☆14 · Nov 3, 2017 · Updated 8 years ago
appcelerator-developer-relations / appc-sample-appsearch
This sample app demonstrates how to make the activities and content of your app searchable via Spotlight, Safari and Siri by using new AP…
☆12 · Oct 13, 2015 · Updated 10 years ago
benedikt-budig / glyph-miner
Glyph Miner, a system for extracting glyphs from early typeset prints
☆34 · Sep 29, 2016 · Updated 9 years ago
NCSU-Libraries / ocracoke
Rails application supporting the creation of OCR and the IIIF Content Search API
☆33 · Dec 14, 2022 · Updated 3 years ago
rmgirardin / mouse-bot
A Discord bot for Star Wars: Galaxy of Heroes
☆12 · Mar 10, 2019 · Updated 7 years ago
freelawproject / related-literature
Want to learn more about Free Law Project technologies, policies and thinking? Get the literature here.
☆25 · Jul 6, 2021 · Updated 5 years ago
pirate / awesome-web-archiving
An Awesome List for getting started with web archiving
☆19 · Dec 21, 2018 · Updated 7 years ago
brett-chen / AMC
Code for KDD 2014 paper "Mining Topics in Documents: Standing on the Shoulders of Big Data"
☆21 · Oct 6, 2015 · Updated 10 years ago
lwrubel / loc-colors
Colors in Library of Congress digital images.
☆32 · Jan 8, 2018 · Updated 8 years ago
az2009 / cielo-m2
☆11 · Jan 20, 2021 · Updated 5 years ago
sld / torch-conv-ner
Deep learning for named entity recognition on CoNLL-2003
☆10 · Dec 23, 2016 · Updated 9 years ago
BitCurator / bitcurator-nlp-gentm
Generate topic models from open text extracted from files in disk images
☆10 · Apr 11, 2023 · Updated 3 years ago
TrsNium / Pix2Pix
🍣Transfer image style 🍣
☆11 · Jun 2, 2017 · Updated 9 years ago
raffaelevacca / EUSN-co-citation-networks
R code to get co-citation networks on social networks in the social sciences vs physics and computer science using Web of Science data.
☆22 · Jan 28, 2015 · Updated 11 years ago
mikhaildubov / AST-text-analysis
Statistical Natural Language Processing with Annotated Suffix Trees
☆22 · Jul 22, 2016 · Updated 9 years ago
kitodo / kitodo-presentation
Kitodo.Presentation is a feature-rich framework for building a METS- or IIIF-based digital library. It is part of the Kitodo Digital Libr…
☆44 · Updated this week
parthasm / Viterbi-Bigram-HMM-Parts-Of-Speech-Tagger
A Python implementation of the Viterbi Algorithm with Bigram Hidden Markov Model(HMM) taggers for predicting Parts of Speech(POS) tags. -…
☆12 · Feb 9, 2016 · Updated 10 years ago
jonschlinkert / is-valid-path
Returns true if a windows file path does not contain any invalid characters.
☆12 · Jan 27, 2023 · Updated 3 years ago
ScienceStories / application
This repo holds the source code for the web application
☆15 · Jul 6, 2023 · Updated 3 years ago
PRImA-Research-Lab / prima-aletheia-web-emop
Web-based page layout editor created for EMOP (Early Modern OCR Project).
☆11 · May 21, 2021 · Updated 5 years ago
mikefogg / BarcodeView
An Appcelerator Titanium Module that allows you to create a barcode scanner view
☆25 · Jul 29, 2016 · Updated 9 years ago
Bookworm-project / BookwormAPI
An API implementing a grammar for text analysis
☆13 · Nov 10, 2015 · Updated 10 years ago
chuharev / grnti-grabber
Tools for handling GRNTI list
☆10 · Sep 2, 2023 · Updated 2 years ago
DHRI-Curriculum / text-analysis
@DHRI-Curriculum Session on text analysis with NLTK, including discussion of cleaning data, creating text corpora, and analyzing texts pr…
☆11 · May 13, 2021 · Updated 5 years ago
jneubert / skos-history
Ontology, processing practices and supporting code for change tracking of SKOS vocabularies
☆41 · Jan 26, 2024 · Updated 2 years ago