scantailor / ScanTailor-CLI-GUI
Batch processing helper – GUI – for “ScanTailor-CLI” -- created by Csaba Kovacs
☆15Updated 9 years ago
Alternatives and similar repositories for ScanTailor-CLI-GUI
Users who are interested in ScanTailor-CLI-GUI compare it to the repositories listed below
- Building scantailor and its dependencies☆62Updated 2 years ago
- Fast PDF generation and compression. Deals with millions of pages daily.☆125Updated 3 weeks ago
- smoothscan is a tool to convert scanned text into a vectorized output form.☆67Updated 12 years ago
- Documentation and use cases for ALTO XML☆41Updated 7 years ago
- OCR for DjVu☆47Updated 3 years ago
- PDF to DjVu converter☆98Updated last year
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)☆195Updated 4 months ago
- Conversions between various OCR formats☆80Updated 2 years ago
- Zotero extension for cataloguing☆49Updated last year
- search interface for scholarly works☆86Updated last year
- Ergonomic line-by-line transcription of scanned text.☆53Updated 4 years ago
- Automatic de-keystoning for single camera DIY book scanners☆23Updated 9 years ago
- Chrome extension that uses Memento to indicate that a page a user is viewing on the live web has an archived copy and to give the user ac…☆55Updated last month
- Efficient hOCR tooling☆46Updated last month
- Automatic de-keystoning for single camera DIY book scanners.☆49Updated 5 years ago
- The CIS OCR PostCorrectionTool☆44Updated 2 years ago
- ALTO XML schema - latest and all former versions☆54Updated last year
- Crop And Splice Segments (of scanned pages)☆14Updated 6 years ago
- Tools to process books in a cloud based pipeline system☆62Updated 5 months ago
- Convert ALTO XML to plain text + minimal metadata☆17Updated 11 months ago
- CDXJ Indexing of WARC/ARCs☆28Updated 10 months ago
- The hOCR Embedded OCR Workflow and Output Format☆74Updated last year
- Scripts to auto-OCR PDFs, translate output using publicly-available or DIY NLP translation models, and generate epub/PDF☆44Updated last year
- A free Windows graphical interface to the Tesseract 4.0 OCR engine.☆61Updated 3 years ago
- Format Identification for Digital Objects (FIDO) is a Python command-line tool to identify the file formats of digital objects. It is des…☆157Updated 6 months ago
- tesseractXplore: an easy-to-use Tesseract GUI with full control☆24Updated 3 years ago
- Open Access PDF harvester☆42Updated last year
- CHM format converter☆98Updated 4 months ago
- Batch convert PDF files to text under Windows, using several text extraction methods or OCR☆35Updated 9 years ago
- postcorrection web☆12Updated 2 years ago