gregjurman / tesserwrap
Python bindings to the Tesseract API
☆66 Updated 8 years ago
Alternatives and similar repositories for tesserwrap
Users who are interested in tesserwrap are comparing it to the libraries listed below.
- ARCHIVED: A Python API for Tesseract ☆20 Updated 7 years ago
- A slim, non-SWIG Python adapter to CTesseract (Tesseract OCR for C). ☆24 Updated 11 years ago
- Modularly extensible semantic metadata validator ☆84 Updated 9 years ago
- Experiments mining image collections using OpenCV ☆64 Updated 10 years ago
- Python bindings for CLD2. ☆16 Updated 6 years ago
- Python wrapper for Pandoc, the universal document converter. ☆215 Updated 9 years ago
- LoadKit supports Extract, Transform, Load processes based on ArchiveKit buckets. ☆11 Updated 10 years ago
- Utility library to turn country names into ISO two-letter codes ☆69 Updated last week
- ArchiveKit manages data and documents during ETL processes, either on a local file system or on S3. ☆15 Updated 10 years ago
- ☆12 Updated 9 years ago
- Python implementation of URI Template ☆78 Updated 7 years ago
- Data analysis tool. ☆85 Updated 2 years ago
- API to extract data from HTML and XML documents ☆9 Updated 2 years ago
- python library for extracting html microdata ☆166 Updated 2 years ago
- Python implementation of the Parsley language for extracting structured data from web pages ☆92 Updated 7 years ago
- Python library with common functionality for writing web scrapers ☆102 Updated 9 years ago
- Bringing sanity to world of messed-up data ☆66 Updated 10 years ago
- A skip dict is a Python dictionary which is permanently sorted by value. ☆19 Updated 10 years ago
- This is a side project from 2008. This package contains a tool for automatically cropping and deskewing images of book pages captured by … ☆28 Updated 12 years ago
- Backport of Python 3's csv module for Python 2 ☆64 Updated 4 years ago
- Plots various graphs for a series of plaintext files in a directory ☆19 Updated 9 years ago
- Makes it easy to respect rate limits. ☆96 Updated 8 years ago
- Mister Bob (the builder) is filesystem template renderer ☆69 Updated last year
- code to remove "noise" from hOCR output of Tesseract OCR. ☆14 Updated 8 years ago
- Server-side Zotero translation based on Mozilla xpcshell (deprecated) ☆38 Updated 6 years ago
- pythonic processes ☆11 Updated 10 years ago
- An expandable and scalable OCR pipeline ☆87 Updated 7 years ago
- python bindings for rlite ☆70 Updated 7 years ago
- A cell magic for futurize ☆10 Updated 9 years ago
- Faster replacement for Python's urlparse module ☆46 Updated 6 years ago