factful/ocr_testing

Scripts and results from our OCR roundup, available on Source

☆149

Alternatives and similar repositories for ocr_testing

Users who are interested in ocr_testing are comparing it to the repositories listed below.

NVlabs / ocropus3
Repository collecting all the submodules for the new PyTorch-based OCR System.
☆141 · Feb 22, 2021 · Updated 5 years ago
ghing / python-data-cheatsheet
Things in Python, Pandas, GeoPandas and Jupyter that I've had to look up or weren't obvious in the documentation.
☆15 · May 29, 2026 · Updated last month
The-Politico / generator-politico-graphics
☆10 · Mar 10, 2019 · Updated 7 years ago
LivingSkyTechnologies / Dense_Article_Dataset_DAD
Dense Article Dataset (DAD): A Benchmark Dataset for Document Layout Analysis
☆16 · Jan 13, 2022 · Updated 4 years ago
ocropus-archive / DUP-ocropy2
Next generation OCR engine based on LSTMs.
☆51 · Apr 8, 2018 · Updated 8 years ago
PRImA-Research-Lab / prima-core-libs
Core libraries by the PRImA Research Lab
☆16 · Jul 30, 2024 · Updated last year
Calamari-OCR / calamari
Line based ATR Engine based on OCRopy
☆1,198 · Jun 23, 2026 · Updated last month
DallasMorningNews / socrata2sql
An SQL loader for datasets published via Socrata
☆28 · Dec 8, 2022 · Updated 3 years ago
jeremybmerrill / wayback2csv
transform a datapoint from a website into a CSV time-series dataset using the wayback machine
☆12 · May 24, 2023 · Updated 3 years ago
chequeado / chequeabot-legacy
This repository contains all the tools we are working with related to Chequeabot's ecosystem.
☆15 · May 27, 2025 · Updated last year
LivingSkyTechnologies / Document_Layout_Segmentation
Repository to use/train segmentation models for document layout analysis
☆19 · Jan 13, 2022 · Updated 4 years ago
Doreenruirui / okralact
A repository for online OCRD training infrastructure.
☆13 · Aug 20, 2020 · Updated 5 years ago
PRImA-Research-Lab / prima-page-converter
Command line tool to convert page layout files to the latest PAGE XML format. It supports all previous versions of the PAGE format as wel…
☆25 · Jan 30, 2021 · Updated 5 years ago
datamade / pdf-textextract
Docker Container for a Make-based, PDF extraction using OCR
☆14 · Jul 31, 2024 · Updated last year
seuretm / ocrd_typegroups_classifier
☆10 · Mar 16, 2023 · Updated 3 years ago
Quartz / aistudio-dochate-public
Learning text classification for journalists through DocHate tips
☆10 · May 13, 2020 · Updated 6 years ago
eyeseast / datasette-geojson-map
Render a map for any query with a geometry column
☆29 · Aug 10, 2024 · Updated last year
glenrobson / iiif_stuff
IIIF Examples and useful code
☆20 · Sep 10, 2025 · Updated 10 months ago
arianagiorgi / postgis-intro
☆11 · Mar 9, 2019 · Updated 7 years ago
VeitL / OCR
Tensorflow re-implementation of the recognition part the paper "Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Rec…
☆24 · Aug 31, 2018 · Updated 7 years ago
newsdev / int-newsapps-template
A template for creating new INT News Apps applications in Django or Flask
☆13 · Feb 15, 2018 · Updated 8 years ago
thecarebot / carebot-tracker
carebot-tracker.js — Carebot's tracking component for Google Analytics events
☆17 · Apr 19, 2016 · Updated 10 years ago
ocropus-archive / DUP-ocropy
Python-based tools for document analysis and OCR
☆3,467 · May 22, 2021 · Updated 5 years ago
javiferran / document-classification
☆15 · Jun 22, 2020 · Updated 6 years ago
sambalshikhar / Document-Image-Classification-with-Intra-Domain-Transfer-Learning-and-Stacked-Generalization-of-Deep
RVL-CDIP could be looked at as the equivalent of ImageNet for the document image community. It’s certainly the largest we’ve seen in the …
☆18 · Nov 4, 2019 · Updated 6 years ago
ocropus / hocr-tools
Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.
☆416 · Aug 10, 2024 · Updated last year
naptha / wrinkle
a little nodejs server and script that extracts letters from images via tesseract
☆19 · Mar 4, 2015 · Updated 11 years ago
code4policy / modules
Course Materials for DPI-691M - "Programming and Data for Policymakers"
☆17 · Jan 17, 2026 · Updated 6 months ago
The-Politico / gspan.js
Parses Google Documents formatted for annotated transcripts –– with JavaScript
☆18 · Feb 14, 2022 · Updated 4 years ago
hiarindam / document-image-classification-TL-SG
Document Image Classification with Intra-Domain Transfer Learning and Stacked Generalization of Deep Convolutional Neural Networks
☆43 · Nov 16, 2019 · Updated 6 years ago
cjdd3b / nicar2013
Various documents and code examples for NICAR 2013 presentations.
☆38 · Mar 1, 2013 · Updated 13 years ago
rdmurphy / node-copytext
A module for accessing a XLSX spreadsheet as a JavaScript object.
☆16 · Aug 25, 2019 · Updated 6 years ago
tmbdev-talks / icdar2019-worksheets
☆25 · Apr 18, 2020 · Updated 6 years ago
tmbdev-talks / icdar2019-tutorial
☆126 · Apr 18, 2020 · Updated 6 years ago
lquirosd / P2PaLA
Page to PAGE Layout Analysis Tool
☆192 · Jan 17, 2022 · Updated 4 years ago
NVlabs / ocropus3-ocrorot
Rotation and skew detection using DL.
☆60 · May 29, 2018 · Updated 8 years ago
trallard / opendata-airflow-tutorial
Tutorials on airflow pipelines with open data sets
☆11 · Sep 2, 2019 · Updated 6 years ago
pietercolpaert / jsonld-stream
A specification for a jsonld nodejs stream
☆15 · Oct 13, 2018 · Updated 7 years ago
watersink / ocrsegment
a deep learning model for page layout analysis / segmentation.
☆101 · Nov 4, 2019 · Updated 6 years ago