rajbot / autocrop
This is a side project from 2008. This package contains a tool for automatically cropping and deskewing images of book pages captured by an Internet Archive Scribe bookscanner.
☆28 Updated 12 years ago
Alternatives and similar repositories for autocrop
Users that are interested in autocrop are comparing it to the libraries listed below
- Experiments mining image collections using OpenCV ☆64 Updated 10 years ago
- Image comparison QA tool for digital preservation workflows. ☆14 Updated 10 years ago
- Document Imaging Archive System. Home document imaging, with OCR. Scan documents (with SANE) or import ODF documents, assign tags. Use op… ☆25 Updated 10 years ago
- Language checker and hyphenator extension for LibreOffice ☆12 Updated 5 years ago
- A MediaWiki-to-HTML parser for Python. ☆54 Updated 6 years ago
- Convert URLs to a normalized Unicode format ☆67 Updated 7 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆47 Updated 7 years ago
- A DSL to build Lucene text queries in Python. ☆38 Updated 8 years ago
- Code to remove "noise" from hOCR output of Tesseract OCR. ☆14 Updated 8 years ago
- A Python implementation of the Double Metaphone algorithm ☆61 Updated 14 years ago
- Utilities for working with data. ☆20 Updated 10 years ago
- Python bindings to the Tesseract API ☆66 Updated 9 years ago
- OCR for DjVu ☆48 Updated 2 years ago
- An expandable and scalable OCR pipeline ☆87 Updated 7 years ago
- Smart progressbar with multiple backends supporting both explicit updating and tqdm-style iterable-wrapping ☆10 Updated 8 years ago
- Attempts to determine the natural language of a selection of Unicode (UTF-8) text (a clone of http://code.google.com/p/guess-language wit… ☆48 Updated 15 years ago
- Django feeds provides an extensive database model for RSS feeds and a fault tolerant parser. ☆30 Updated 13 years ago
- Python library and command line tool for converting data from one format to another ☆99 Updated 5 years ago
- Pythonic framework for working with simulations. ☆69 Updated 10 years ago
- A git repository indexer (using whoosh as the engine) ☆19 Updated 10 years ago
- Pyline is a grep-like, sed-like, awk-like command-line tool for line-based text processing in Python. https://pypi.python.org/pypi/pyline ☆39 Updated 3 months ago
- DEPRECATED - Code for source.mozillaopennews.org/ ☆37 Updated 6 years ago
- Simple to use Python library for Buffer App ☆23 Updated 2 years ago
- A slim, non-SWIG Python adapter to CTesseract (Tesseract OCR for C). ☆24 Updated 11 years ago
- Serapis is a sentence identifier and modeling pipeline / built for Wordnik ☆24 Updated 9 years ago
- Import GeoNames.org data into a SQLite database for full-text search and autocomplete ☆35 Updated 6 years ago
- Color manipulation in Python ☆114 Updated last year
- A storage layer for numeric data that changes over time ☆333 Updated 9 years ago
- Speech recognition in Python made easy and flexible ☆11 Updated 10 years ago
- A skip dict is a Python dictionary which is permanently sorted by value. ☆19 Updated 10 years ago