danigm / poppler
Personal clone of Poppler. The official repository is here: https://gitlab.freedesktop.org/poppler/poppler
☆130 Updated 6 years ago
Alternatives and similar repositories for poppler
Users who are interested in poppler are comparing it to the libraries listed below.
- PoDoFo is a library to work with the PDF file format. The name comes from the first letter of PDF (Portable Document Format). A few tools… ☆52 Updated 10 years ago
- This is not the poppler repository. Please see https://poppler.freedesktop.org/ ☆53 Updated 15 years ago
- A more complete example of programming with PDFMiner, which continues where the default documentation stops ☆214 Updated 5 years ago
- Convert a docx (OOXML) file to html. This project is deprecated in favor of https://github.com/OpenScienceFramework/pydocx ☆45 Updated 11 years ago
- pgmagick is yet another Boost.Python-based wrapper for GraphicsMagick/ImageMagick. ☆154 Updated 4 months ago
- Python script to do PDF OCR conversion using Tesseract ☆375 Updated last year
- CFFI-based cairo bindings for Python. ☆209 Updated 5 months ago
- ☆423 Updated 10 years ago
- ☆163 Updated 10 years ago
- cli for extracting text from PDF files (and maybe possibly tables) ☆76 Updated last month
- Extremely Naive Charset Analyser ☆286 Updated 7 months ago
- Wrapper for pdftohtml that tries to extract paragraph structure ☆50 Updated 6 years ago
- An open-source ODBC driver manager and SDK that facilitates the development of database-independent applications on linux, freebsd, unix … ☆171 Updated 4 months ago
- A library for extracting tables from PDF files ☆89 Updated 11 years ago
- mirrored from git://git.ghostscript.com/mupdf.git ☆57 Updated last year
- Linguistic Annotation and Visualization Tool for PDF Documents ☆199 Updated 5 years ago
- Fast multi-keyword search engine for text strings ☆253 Updated 8 months ago
- A toolbox for working with the Chinese language in Python ☆150 Updated 5 years ago
- A dependency-free C interface to the Mozilla Universal Character Set Detector ☆67 Updated 8 years ago
- Extract meaningful content from PDF and PSD files, such as text and images, both linked into a common JSON string ☆37 Updated 7 years ago
- uchardet is an encoding detector library, which takes a sequence of bytes in an unknown character encoding and attempts to determine the … ☆44 Updated 11 months ago
- Run pdf2htmlEX in a Docker container. ☆25 Updated last year
- Python module for reading StarDict dictionaries ☆45 Updated last year
- Command line tool to extract figures, tables, and captions from scholarly documents in PDF form. ☆130 Updated 7 years ago
- Chinese segmentation library ☆82 Updated 14 years ago
- Python binding to libpoppler-qt5 ☆43 Updated last year
- Microsoft (MS) EMF to SVG conversion library ☆99 Updated 8 months ago
- Scrapinghub Command Line Client ☆133 Updated 3 weeks ago
- compact_enc_det - Compact Encoding Detection ☆229 Updated last year
- Python API for Various DB-Backed Simhash Clusters ☆64 Updated 8 years ago