rajbot / autocrop
This is a side project from 2008. This package contains a tool for automatically cropping and deskewing images of book pages captured by an Internet Archive Scribe bookscanner.
☆28 Updated 11 years ago
Alternatives and similar repositories for autocrop:
Users who are interested in autocrop are comparing it to the libraries listed below.
- This is a module to extract RDF from an HTML5 page annotated with microdata. The module implements the algorithm defined and published by th… ☆44 Updated 2 years ago
- Python library implementing the ISO/IEC 26300 OpenDocument Format standard (ODF) ☆53 Updated 4 years ago
- Code to remove "noise" from hOCR output of Tesseract OCR. ☆14 Updated 8 years ago
- A simple PDF transcription project for PyBossa ☆19 Updated 9 years ago
- Experiments mining image collections using OpenCV ☆64 Updated 9 years ago
- Image comparison QA tool for digital preservation workflows. ☆14 Updated 10 years ago
- PIL-compatible interface for platform libraries such as GraphicsMagick, Aware or JAI. ☆25 Updated 7 years ago
- Check out https://github.com/webrecorder/webrecorder for newer version matching https://webrecorder.io ☆38 Updated 9 years ago
- yael (Yet Another EPUB Library) is a Python library for reading, manipulating, and writing EPUB 2/3 files ☆18 Updated 9 years ago
- All the reports and data powering http://weekly.hatnote.com ☆12 Updated this week
- A web-based tool to monitor how your website content is used in Wikipedia ☆37 Updated 4 years ago
- Handwritten optical character recognition ☆25 Updated 9 years ago
- Recognition Models for Kraken and CLSTM ☆14 Updated 5 years ago
- Collection of modules to build distributed and reliable concurrent systems in Python. ☆205 Updated 11 years ago
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi… ☆48 Updated 3 years ago
- A Python framework to generate HTML and JavaScript from reusable and combinable widgets. ☆23 Updated 2 years ago
- Utilities for working with data. ☆20 Updated 10 years ago
- A Simple API for RDF ☆29 Updated 15 years ago
- Code for several utilities for use with VIVO ☆11 Updated 12 years ago
- ArchiveKit manages data and documents during ETL processes, either on a local file system or on S3. ☆15 Updated 9 years ago
- An expandable and scalable OCR pipeline ☆87 Updated 7 years ago
- Wikipedia citation tool for Google Books, New York Times, ISBN, DOI and more ☆22 Updated 8 years ago
- Data Store for Annotation Studio ☆46 Updated 2 years ago
- For code related to making ePub files ☆40 Updated 9 years ago
- Vidscraper is a Python library which provides a simple API for fetching video data from various web services and sites. ☆62 Updated 2 years ago
- A Django port of MySociety's FixMyStreet, maintained by VisibleGovernment.ca ☆54 Updated 13 years ago
- A slim, non-SWIG Python adapter to CTesseract (Tesseract OCR for C). ☆24 Updated 10 years ago
- A set of tools for rotating, cropping, and binding the images from a scanned book into a PDF. ☆19 Updated 6 years ago
- Recursively deduplicate a directory and write its contents to a new directory while remembering the old paths ☆48 Updated 4 years ago
- Markdown for Linked Data ☆16 Updated 10 years ago