Convert a corpus of PDFs to clean text files on a distributed architecture
☆39Mar 5, 2024Updated 2 years ago
Alternatives and similar repositories for ocr-pipeline
Users that are interested in ocr-pipeline are comparing it to the libraries listed below.
- Trading Consequences data and code☆15Mar 5, 2015Updated 11 years ago
- Code for a balancing robot using the Raspberry Pi.☆14Jan 13, 2016Updated 10 years ago
- Functions for analysing public patenting data.☆16Oct 9, 2018Updated 7 years ago
- Some bits of javascript to transcribe scanned pages using PageXML☆17Mar 18, 2024Updated 2 years ago
- Part of eMOP: Franken+ tool for creating font training for Tesseract OCR engine from page images.☆24Sep 24, 2015Updated 10 years ago
- Code for the CIKM 2013 paper "Discovering Coherent Topics Using General Knowledge"☆11Jul 14, 2014Updated 11 years ago
- https://openfoamwiki.net/index.php/Contrib/PyFoam - Unofficial mirror of svn://svn.code.sf.net/p/openfoam-extend/svn/trunk/Breeder/other/…☆23Aug 14, 2018Updated 7 years ago
- Alfred workflow for Wikipedia☆12Sep 17, 2016Updated 9 years ago
- ☆25Oct 9, 2022Updated 3 years ago
- 🐙 JSON diff diver — the time machine for your JSON objects☆16Apr 8, 2026Updated last month
- Getting-started course on data analysis with the R language☆31Oct 14, 2015Updated 10 years ago
- Mindmap to markdown converter☆11Oct 9, 2016Updated 9 years ago
- Toolbox for OCR post-correction☆122Sep 19, 2019Updated 6 years ago
- Micro quadcopter (M2M ~80mm)☆26Sep 12, 2014Updated 11 years ago
- A Glyph script demonstrating how to automate viscous meshing for aircraft geometry.☆25Feb 27, 2025Updated last year
- A set of Python string distance metrics for string distance comparisons☆27Aug 13, 2011Updated 14 years ago
- Rails application supporting the creation of OCR and the IIIF Content Search API☆34Dec 14, 2022Updated 3 years ago
- WeChat mobile client crawler that scrapes all articles, read counts, like counts, and comments from official accounts☆11Nov 11, 2018Updated 7 years ago
- Want to learn more about Free Law Project technologies, policies and thinking? Get the literature here.☆25Jul 6, 2021Updated 4 years ago
- Glyph Miner, a system for extracting glyphs from early typeset prints☆34Sep 29, 2016Updated 9 years ago
- The second generation of the Triangle Regional Model☆13Apr 13, 2026Updated 3 weeks ago
- OCR for DjVu☆47Oct 3, 2022Updated 3 years ago
- Gamera 3 for Python 2 (deprecated)☆39Aug 15, 2022Updated 3 years ago
- Code for KDD 2014 paper "Mining Topics in Documents: Standing on the Shoulders of Big Data"☆21Oct 6, 2015Updated 10 years ago
- Website for America's Public Bible☆11Oct 1, 2020Updated 5 years ago
- Nuerapse simulations for SNNs☆25Oct 10, 2018Updated 7 years ago
- Implementation of DCTTS with Adversarial Training☆12Dec 30, 2019Updated 6 years ago
- Speed up your Localization / Internationalization efforts by automating translation with a single script☆27Feb 18, 2017Updated 9 years ago
- A repository for documentation and tutorials (recipes) that help us cook up great projects☆12Aug 31, 2023Updated 2 years ago
- A module for Omeka S that provides an API for the Neatline 3 single page application☆18Mar 26, 2023Updated 3 years ago
- Generate topic models from open text extracted from files in disk images☆10Apr 11, 2023Updated 3 years ago
- Docs, notes and resources that don't fit elsewhere.☆13May 23, 2023Updated 2 years ago
- R code to get co-citation networks on social networks in the social sciences vs physics and computer science using Web of Science data.☆22Jan 28, 2015Updated 11 years ago
- Image thumbnailing middleware for Connect.js/Express.js utilizing Smartcrop.js☆30Apr 3, 2018Updated 8 years ago
- Kitodo.Presentation is a feature-rich framework for building a METS- or IIIF-based digital library. It is part of the Kitodo Digital Libr…☆43Updated this week
- LOC Standards, Schemas, Stylesheets, etc.☆11Sep 30, 2025Updated 7 months ago
- Change screen brightness on Linux systems☆12Jul 6, 2015Updated 10 years ago
- Automatically sort and merge PPT files☆10Aug 24, 2017Updated 8 years ago
- An expandable and scalable OCR pipeline☆90Nov 14, 2017Updated 8 years ago