Various documents related to Tesseract OCR
☆267Sep 12, 2021Updated 4 years ago
Alternatives and similar repositories for docs
Users that are interested in docs are comparing it to the libraries listed below.
- Tesseract documentation☆75Sep 12, 2021Updated 4 years ago
- Source training data for Tesseract for lots of languages☆867Apr 1, 2025Updated last year
- Tesseract Open Source OCR Engine (main repository)☆74,159Apr 27, 2026Updated 3 weeks ago
- Trained models with fast variant of the "best" LSTM models + legacy models☆7,535Mar 9, 2024Updated 2 years ago
- Part of eMOP: Franken+ tool for creating font training for Tesseract OCR engine from page images.☆24Sep 24, 2015Updated 10 years ago
- Python-based tools for document analysis and OCR☆3,471May 22, 2021Updated 5 years ago
- A small Docker built for the OCRopus OCR system.☆19Dec 16, 2017Updated 8 years ago
- A library and command-line tool for fetching Facebook Pages' published posts.☆13Jul 18, 2017Updated 8 years ago
- ☆16Mar 24, 2021Updated 5 years ago
- A small C++ implementation of LSTM networks, focused on OCR.☆831Oct 24, 2019Updated 6 years ago
- Train Tesseract LSTM with make☆720Apr 18, 2025Updated last year
- Double-checked Gold Standard Data for Training and Testing OCR Engines☆18Jun 15, 2017Updated 8 years ago
- A tool that generates character samples for OCR training.☆65Oct 22, 2017Updated 8 years ago
- 🖺 OCR using tensorflow with attention☆644Sep 5, 2019Updated 6 years ago
- A Python wrapper for the tesseract-ocr API☆2,166Mar 16, 2026Updated 2 months ago
- ☆17Mar 8, 2018Updated 8 years ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.☆411Aug 10, 2024Updated last year
- transform a datapoint from a website into a CSV time-series dataset using the wayback machine☆12May 24, 2023Updated 2 years ago
- Leptonica is an open source library containing software that is broadly useful for image processing and image analysis applications. The …☆2,046May 10, 2026Updated last week
- Library with user interface elements and client-server communication classes based on Google Web Toolkit (GWT) that can be used for crowd…☆14Oct 3, 2017Updated 8 years ago
- Files and Scripts to run Tesseract 5 LSTM Training using fonts☆79Feb 6, 2022Updated 4 years ago
- ☆48Mar 24, 2023Updated 3 years ago
- Links to awesome OCR projects☆3,108Jul 6, 2024Updated last year
- Repository collecting all the submodules for the new PyTorch-based OCR System.☆141Feb 22, 2021Updated 5 years ago
- Line segmentation of handwritten documents based on the A* path planning algorithm☆21Mar 4, 2023Updated 3 years ago
- Best (most accurate) trained LSTM models.☆1,547Mar 9, 2024Updated 2 years ago
- A collection of Django extensions that add content-management facilities to Django projects.☆41May 7, 2015Updated 11 years ago
- A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.☆196May 13, 2026Updated last week
- a general list of resources and articles for people interested in getting into data journalism☆16Apr 12, 2023Updated 3 years ago
- Grab nonprofit tax information from the ProPublica API and put it in a Google spreadsheet!☆14Jun 2, 2017Updated 8 years ago
- Converters for various file formats used for representing OCR☆12Apr 30, 2025Updated last year
- Code from NICAR 2020☆17Jan 28, 2021Updated 5 years ago
- Website for Bazel, a fast, scalable, multi-language and extensible build system☆19Jul 7, 2022Updated 3 years ago
- Training files produced for and by the Tesseract OCR engine for work on the Early Modern OCR Project (eMOP)☆37Sep 24, 2015Updated 10 years ago
- FOIL resources for New York City and New York State☆18Dec 16, 2015Updated 10 years ago
- Very basic Tesseract-OCR example with CPPAN. Cppan support is discontinued. Please use sw (cppan v2) instead. Updated example is here: ht…☆31Jul 9, 2018Updated 7 years ago
- A tutorial on the PyTorch-based ocropus components.☆73Apr 18, 2020Updated 6 years ago
- ☆19Mar 20, 2019Updated 7 years ago
- Source and scripts for generating DCMI Metadata Terms documentation☆33Feb 3, 2018Updated 8 years ago