danvk / boxedit
A web-based editor for Tesseract box files
☆29 Updated 11 years ago
Alternatives and similar repositories for boxedit
Users who are interested in boxedit are comparing it to the repositories listed below
- An implementation of RESTful web service for tesseract-OCR using tornado ☆136 Updated 2 years ago
- An expandable and scalable OCR pipeline ☆89 Updated 8 years ago
- Exploring extracting tables from a PDF to CSV using PDF.JS ☆105 Updated 9 years ago
- Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your com… ☆135 Updated 3 months ago
- batch Optical Mark Recognition without foresight ☆39 Updated last year
- official diybookscanner repository ☆39 Updated 11 years ago
- pure javascript lstm rnn implementation based on ocropus ☆38 Updated 11 years ago
- Python binding to libpoppler with focus on text extraction ☆97 Updated 4 years ago
- Extract postal addresses from the DOM ☆66 Updated 13 years ago
- Recognition Models for Kraken and CLSTM ☆16 Updated 6 years ago
- Apache OpenNLP wrapper for Nodejs ☆56 Updated 6 years ago
- crawler for YouTube ☆47 Updated 11 years ago
- Facilitating the global conversation on academic literature ☆267 Updated 8 years ago
- Demos for the limdu.js package ☆18 Updated 3 years ago
- Receipt scanner extracts information from your PDF or image receipts - built in NodeJS ☆307 Updated 7 years ago
- A library for extracting tables from PDF files ☆89 Updated 12 years ago
- A small framework taking over the manual training process described in the Tesseract3 Wiki: https://code.google.com/p/tesseract-ocr/wiki/… ☆131 Updated 2 years ago
- Convert a corpus of PDF to clean text files on a distributed architecture ☆38 Updated last year
- A small Docker built for the OCRopus OCR system. ☆19 Updated 8 years ago
- An api to parse a CV, in particular the elements of its publication list ☆35 Updated 7 years ago
- Server endpoint for Meteor's DDP protocol in C++ ☆19 Updated 9 years ago
- Fill PDF forms and return either filled PDF or PDF created from rendered page images. ☆232 Updated 3 years ago
- Next generation OCR engine based on LSTMs. ☆52 Updated 7 years ago
- Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl,… ☆80 Updated last month
- PM2 module to redirect application logs to syslog ☆41 Updated 3 years ago
- Tooling to extract data from scanned paper forms OCR-ed by Tesseract using the HOCR standard. ☆84 Updated 9 years ago
- A node.js library for processing and understanding scanned documents ☆340 Updated 3 years ago
- Illuminating the forest AND the trees in your data ☆38 Updated 9 years ago
- interactive network visualization ☆104 Updated 2 months ago
- search, dedupe, and media ingestion for mediachain ☆32 Updated 9 years ago