rajbot / autocrop
This is a side project from 2008. This package contains a tool for automatically cropping and deskewing images of book pages captured by an Internet Archive Scribe bookscanner.
☆28 Updated 12 years ago
Alternatives and similar repositories for autocrop:
Users who are interested in autocrop are comparing it to the libraries listed below:
- This is a module to extract RDF from an HTML5 page annotated with microdata. The module implements the algorithm defined and published by th… ☆44 Updated 2 years ago
- Import GeoNames.org data into a SQLite database for full-text search and autocomplete ☆35 Updated 6 years ago
- Django feeds provides an extensive database model for RSS feeds and a fault tolerant parser. ☆31 Updated 12 years ago
- a web based tool to monitor how your website content is used in wikipedia ☆37 Updated 4 years ago
- Smart progressbar with multiple backends supporting both explicit updating and tqdm-style iterable-wrapping ☆10 Updated 8 years ago
- A slim, non-SWIG Python adapter to CTesseract (Tesseract OCR for C). ☆24 Updated 11 years ago
- A python framework to generate html and JavaScript from reusable and combine-able widgets. ☆23 Updated 2 years ago
- a Simple API for RDF ☆29 Updated 15 years ago
- Pyline is a grep-like, sed-like, awk-like command-line tool for line-based text processing in Python. https://pypi.python.org/pypi/pyline ☆38 Updated 9 months ago
- code to remove "noise" from hOCR output of Tesseract OCR. ☆14 Updated 8 years ago
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi… ☆48 Updated 3 years ago
- Feedbuffer buffers RSS and Atom syndication feeds, that is to say it caches new feed entries until the news aggregator requests them and … ☆19 Updated 8 years ago
- Markdown -> IPython conversion tool ☆15 Updated 10 years ago
- csvcat ☆22 Updated 9 years ago
- Vidscraper is a python library which provides a simple API for fetching video data from various web services and sites. ☆62 Updated 2 years ago
- Django framework for crowdsourcing complex tasks using MTurk ☆64 Updated 14 years ago
- Python library for creating word clouds from text ☆51 Updated 5 years ago
- Serapis is a sentence identifier and modeling pipeline / built for Wordnik ☆24 Updated 8 years ago
- Sometimes you just need a lot of text. Plainstream is a small Python app that provides you with a plain text stream directly from Wikiped… ☆24 Updated last year
- A skip dict is a Python dictionary which is permanently sorted by value. ☆19 Updated 10 years ago
- Plots various graphs for a series of plaintext files in a directory ☆19 Updated 8 years ago
- A MediaWiki-to-HTML parser for Python. ☆53 Updated 5 years ago
- An Online Logic Assistant Based on Coq ☆25 Updated 13 years ago
- Language checker and hyphenator extension for LibreOffice ☆13 Updated 5 years ago
- A Python version (almost a port) of ProPublica's TableFu ☆231 Updated 11 years ago
- Simple to use python library for Buffer App ☆23 Updated 2 years ago
- A tool to upload and synchronize static websites to the Amazon S3 cloud. ☆22 Updated 9 years ago
- Utilities for working with data. ☆20 Updated 10 years ago
- A simple PDF transcription project for PyBossa ☆19 Updated 9 years ago
- Structured Data from PDF image-based files ☆88 Updated 12 years ago