rajbot / autocrop
This is a side project from 2008. This package contains a tool for automatically cropping and deskewing images of book pages captured by an Internet Archive Scribe bookscanner.
☆28 Updated 12 years ago
Alternatives and similar repositories for autocrop
Users who are interested in autocrop are comparing it to the libraries listed below.
- Document Imaging Archive System. Home document imaging, with OCR. Scan documents (with SANE) or import ODF documents, assign tags. Use op… ☆25 Updated 10 years ago
- A DSL to build Lucene text queries in Python. ☆38 Updated 9 years ago
- Backend part of Paperwork (Python API, no UI) ☆18 Updated 7 years ago
- OCR for DjVu ☆47 Updated 3 years ago
- Language checker and hyphenator extension for LibreOffice ☆12 Updated 6 years ago
- Some convenient natural language tools that build on NLTK. ☆85 Updated 11 years ago
- Python bindings to the Tesseract API ☆66 Updated 9 years ago
- A Python implementation of the Double Metaphone algorithm ☆61 Updated 15 years ago
- Focused Crawler for VT's CTRNet ☆10 Updated 12 years ago
- Django feeds provides an extensive database model for RSS feeds and a fault tolerant parser. ☆30 Updated 13 years ago
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi… ☆48 Updated 4 years ago
- Simple-to-use Python library for Buffer App ☆23 Updated 3 years ago
- Image processing and image analysis software. (Mirror of source) ☆21 Updated 14 years ago
- An expandable and scalable OCR pipeline ☆89 Updated 8 years ago
- Find which links on a web page are pagination links ☆29 Updated 9 years ago
- A slim, non-SWIG Python adapter to CTesseract (Tesseract OCR for C). ☆24 Updated 11 years ago
- A MediaWiki-to-HTML parser for Python. ☆54 Updated 6 years ago
- Lightweight, multilingual natural language processing ☆63 Updated 12 years ago
- Data science tools from Moz ☆23 Updated 9 years ago
- Internal Stack Exchange ☆26 Updated 10 years ago
- A skip dict is a Python dictionary which is permanently sorted by value. ☆19 Updated 11 years ago
- Smart progressbar with multiple backends supporting both explicit updating and tqdm-style iterable-wrapping ☆10 Updated 9 years ago
- Serapis is a sentence identifier and modeling pipeline / built for Wordnik ☆24 Updated 9 years ago
- Python library with common functionality for writing web scrapers ☆102 Updated 10 years ago
- Attempts to determine the natural language of a selection of Unicode (utf-8) text (a clone of http://code.google.com/p/guess-language wit… ☆48 Updated 15 years ago
- ... just because nltk is too heavy ☆35 Updated 15 years ago
- Pure Python imaging library with Python 2.6, 2.7, 3.1+ support ☆246 Updated 12 years ago
- A simple, quick, powerful web framework ☆184 Updated 7 years ago
- This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet… ☆31 Updated last week
- Convert URLs to a normalized Unicode format ☆67 Updated 8 years ago