gregjurman / tesserwrap
Python bindings to the Tesseract API
☆66Updated 9 years ago
Alternatives and similar repositories for tesserwrap
Users that are interested in tesserwrap are comparing it to the libraries listed below
- A slim, non-SWIG Python adapter to CTesseract (Tesseract OCR for C).☆24Updated 11 years ago
- Utility library to turn country names into ISO two-letter codes☆71Updated last month
- Python wrapper for Pandoc—the universal document converter.☆215Updated 9 years ago
- An expandable and scalable OCR pipeline☆87Updated 7 years ago
- An extendable docx file format parser and converter☆192Updated 4 months ago
- ARCHIVED: A Python API for Tesseract☆20Updated 8 years ago
- Python library with common functionality for writing web scrapers☆102Updated 10 years ago
- Regular Expression based parsers for extracting data from natural languages☆71Updated 8 years ago
- Tools for parsing messy tabular data. This is now superseded by https://github.com/frictionlessdata/tabulator-py☆390Updated 2 years ago
- Binary Python bindings for poppler utils for content extraction☆42Updated 4 years ago
- Next-gen web application for public finance data warehouses, formerly OpenSpending☆57Updated 3 years ago
- Dat python client☆45Updated 9 years ago
- A library for extracting tables from PDF files☆92Updated 5 years ago
- Experiments mining image collections using OpenCV☆64Updated 10 years ago
- Unicode transliteration in Python (clone of Tomaž Šolc repository at zemanta.com)☆114Updated 9 years ago
- Python library and command line tool for converting data from one format to another☆99Updated 5 years ago
- Primary LocalWiki backend server environment☆47Updated 7 years ago
- Python package for Google's diff-match-patch native C++ implementation.☆82Updated last year
- A tiny library for Python text normalisation. Useful for ad-hoc text processing.☆154Updated last week
- Sickle: OAI-PMH for Humans☆114Updated 2 years ago
- A data processing pipeline that schedules and runs content harvesters, normalizes their data, and outputs that normalized data to a varie…☆41Updated 9 years ago
- WaterButler is a Python web application for interacting with various file storage services via a single RESTful API, developed at Center …☆62Updated this week
- csvcat☆22Updated 9 years ago
- video indexing site☆215Updated 9 years ago
- Modularly extensible semantic metadata validator☆84Updated 9 years ago
- A library for extracting tables from PDF files☆89Updated 11 years ago
- Snowball stemming library collection for Python☆121Updated 6 years ago
- Backport of Python 3's csv module for Python 2☆64Updated 4 years ago
- A library to easily measure what's going on in your python.☆305Updated 5 years ago
- A persistent, full-text searchable key-value store. Powered by Flask, ElasticSearch, S3, and good intentions.☆478Updated 8 years ago