rajbot / autocrop
This is a side project from 2008. This package contains a tool for automatically cropping and deskewing images of book pages captured by an Internet Archive Scribe bookscanner.
☆28Updated 12 years ago
Alternatives and similar repositories for autocrop
Users who are interested in autocrop are comparing it to the libraries listed below.
- code to remove "noise" from hOCR output of Tesseract OCR.☆14Updated 8 years ago
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi…☆48Updated 3 years ago
- A simple PDF transcription project for PyBossa☆19Updated 9 years ago
- PIL-compatible interface for platform libraries such as GraphicsMagick, Aware or JAI.☆25Updated 7 years ago
- a web based tool to monitor how your website content is used in wikipedia☆37Updated 4 years ago
- Wikipedia citation tool for Google Books, New York Times, ISBN, DOI and more☆22Updated 8 years ago
- Document Imaging Archive System. Home document imaging, with OCR. Scan documents (with SANE) or import ODF documents, assign tags. Use op…☆24Updated 9 years ago
- A queue-controlled browser automation tool for improving web crawl quality☆61Updated 2 months ago
- Python bindings to the Tesseract API☆66Updated 8 years ago
- Vidscraper is a python library which provides a simple API for fetching video data from various web services and sites.☆62Updated 2 years ago
- Python's missing statistical Swiss Army knife☆15Updated 9 years ago
- ArchiveKit manages data and documents during ETL processes, either on a local file system or on S3.☆15Updated 10 years ago
- A python framework to generate html and JavaScript from reusable and combine-able widgets.☆23Updated 2 years ago
- Utilities for working with data.☆20Updated 10 years ago
- Discover, analyze and present data from the web and mobile in meaningful ways☆82Updated 11 years ago
- Feedbuffer buffers RSS and Atom syndication feeds, that is to say it caches new feed entries until the news aggregator requests them and …☆19Updated 8 years ago
- Serapis is a sentence identifier and modeling pipeline / built for Wordnik☆24Updated 8 years ago
- Experiments mining image collections using OpenCV☆64Updated 9 years ago
- Django feeds provides an extensive database model for RSS feeds and a fault tolerant parser.☆31Updated 12 years ago
- Image comparison QA tool for digital preservation workflows.☆14Updated 10 years ago
- Simple type converters: make ints, floats, bools and dates from your strings!☆11Updated 8 years ago
- A slim, non-SWIG Python adapter to CTesseract (Tesseract OCR for C).☆24Updated 11 years ago
- your elastic friend to start supervisord processes based on cpu cores available.☆16Updated 9 years ago
- A declarative data-migration package☆16Updated 5 months ago
- A bridge to the JS CoffeeScript compiler (EOL: Please use coffee command or webpack).☆82Updated 7 years ago
- an rdflib plugin to parse html5 microdata☆53Updated 13 years ago
- Check out https://github.com/webrecorder/webrecorder for newer version matching https://webrecorder.io☆38Updated 9 years ago
- Plots various graphs for a series of plaintext files in a directory☆19Updated 8 years ago
- Django batch uploading☆9Updated 4 years ago
- Because a picture is worth a thousand words☆56Updated 9 years ago