danvk / boxedit
A web-based editor for Tesseract box files
☆28 Updated 11 years ago
Alternatives and similar repositories for boxedit
Users who are interested in boxedit are comparing it to the repositories listed below.
- A node.js library for extracting data from scanned forms. ☆117 Updated 3 years ago
- An implementation of RESTful web service for tesseract-OCR using tornado ☆136 Updated 2 years ago
- Exploring extracting tables from a PDF to CSV using PDF.JS ☆104 Updated 9 years ago
- An expandable and scalable OCR pipeline ☆89 Updated 8 years ago
- A library for extracting tables from PDF files ☆89 Updated 12 years ago
- Extract tables from PDF pages. ☆298 Updated 5 years ago
- Ocular is a state-of-the-art historical OCR system. ☆266 Updated last year
- Python binding to libpoppler with focus on text extraction ☆97 Updated 4 years ago
- Tooling to extract data from scanned paper forms OCR-ed by Tesseract using the HOCR standard. ☆84 Updated 9 years ago
- Extract postal addresses from the DOM ☆66 Updated 13 years ago
- Mapping photos of Old New York ☆293 Updated last year
- A node.js library for processing and understanding scanned documents ☆340 Updated 3 years ago
- OCR evaluation brought to you by University of Alicante ☆67 Updated 3 years ago
- A small framework taking over the manual training process described in the Tesseract3 Wiki: https://code.google.com/p/tesseract-ocr/wiki/… ☆131 Updated 2 years ago
- Evaluating the performance and accuracy of ABBYY FineReader's OCR on Senate Financial Disclosure scanned forms ☆135 Updated 9 years ago
- Node library to stream CouchDB changes into PostgreSQL ☆115 Updated 4 years ago
- Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your com… ☆134 Updated 2 months ago
- crawler for YouTube ☆47 Updated 11 years ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆46 Updated 9 months ago
- Working with hOCR in Javascript ☆136 Updated 2 years ago
- An online annotation platform for teaching and learning in the humanities. ☆108 Updated last month
- Binary Python bindings for poppler utils for content extraction ☆42 Updated 4 years ago
- Apache Tika Server as a Docker Image ☆174 Updated 3 years ago
- gathering point for open source OCR scripts and diffs ☆43 Updated 11 years ago
- A small Docker built for the OCRopus OCR system. ☆19 Updated 8 years ago
- Apache OpenNLP wrapper for Nodejs ☆56 Updated 6 years ago
- IMAP => Apache CouchDB (and Cloudant) archiver/importer/thing ☆17 Updated 9 years ago
- Expose Spacy nlp text parsing to Nodejs (and other languages) via socketIO ☆228 Updated 3 years ago
- 'ocr-evaluation-tools' from http://ancientgreekocr.org/. Tools to test OCR accuracy. ☆22 Updated 7 years ago
- Helps you extract CSV data tables from PDF files using the mighty tabula-java. See https://github.com/tabulapdf/tabula-java ☆81 Updated 6 years ago