torakiki / sejda
An extendible and configurable PDF manipulation layer library written in Java.
☆524 Updated last month
Alternatives and similar repositories for sejda:
Users who are interested in sejda are comparing it to the libraries listed below:
- PDF Command Line Tools binaries for Linux, Mac, Windows ☆642 Updated 3 weeks ago
- A post-processing tool for scanned sheets of paper. ☆1,067 Updated 9 months ago
- Textricator is a tool to extract text from documents and generate structured data. ☆347 Updated last month
- PDF to DjVu converter ☆100 Updated last year
- A free tool to OCR a PDF and add a text "layer" in the original file, making a searchable PDF. Use only open source tools. Please tip! ☆284 Updated last year
- ☆734 Updated 4 months ago
- Industry supported, open source PDF/A validation library ☆289 Updated 2 weeks ago
- Adds text to PDF files using the cuneiform OCR software ☆326 Updated 4 years ago
- Extract tables from PDF files ☆356 Updated 8 years ago
- Fork of Briss (http://briss.sourceforge.net/), an application for cropping PDF files ☆57 Updated last year
- Read-only mirror of https://gitlab.gnome.org/GNOME/ocrfeeder ☆86 Updated last month
- save/convert web pages to a standalone editable html file for offline archive/view/edit/play/whatever ☆504 Updated 3 years ago
- ☆262 Updated 8 years ago
- mirror of https://gitlab.mister-muffin.de/josch/img2pdf for Travis and appveyor CI ☆535 Updated 3 weeks ago
- small collection of python scripts for pdf manipulation ☆95 Updated last year
- ScanTailor Advanced is the version that merges the features of the ScanTailor Featured and ScanTailor Enhanced versions, brings new ones … ☆1,247 Updated last year
- ScanTailor Universal - a fork based on Enhanced+Featured+Master versions of ST ☆210 Updated 3 weeks ago
- pdf watermark removal library for academic papers ☆543 Updated 4 years ago
- RUPS is an acronym for Reading and Updating PDF Syntax. RUPS is a tool built on top of iText® that allows you to look inside a PDF docume… ☆307 Updated this week
- Invenio digital library framework ☆638 Updated 5 months ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆389 Updated 8 months ago
- The backend code that powers Unpaywall. support@unpaywall.org ☆334 Updated 2 weeks ago
- web interface for recoll desktop search ☆285 Updated 4 years ago
- veraPDF GUI, CLI and installer ☆83 Updated 2 weeks ago
- 🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based ☆314 Updated last year
- python app/framework for 'all things ISBN' including metadata, descriptions, covers... ☆225 Updated last year
- Métamorphose Renamer v2 ☆147 Updated 5 years ago
- Library to transform Chrome bookmarks to tags ☆150 Updated 5 years ago
- Official DownThemAll! repository. Pull requests welcome. ☆520 Updated 3 years ago
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR … ☆65 Updated last year