radkovo / Pdf2Dom

Pdf2Dom is a PDF parser that converts documents to an HTML DOM representation. The resulting DOM tree can then be serialized to an HTML file or processed further. A command-line utility for converting PDF documents to HTML is included in the distribution package. Pdf2Dom can also be used as an independent Java library with a standard DOM …
180 · Updated 2 years ago
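
The description says the resulting DOM tree can be serialized to an HTML file or processed further through a standard DOM interface. The sketch below illustrates only that second half, using nothing but the JDK's `org.w3c.dom` and `javax.xml.transform` packages. It does not call Pdf2Dom itself: `DomToHtmlSketch`, `buildSampleDom`, `serializeToHtml`, and `countElements` are hypothetical names, and the hand-built tree is an assumed stand-in for whatever a PDF-to-DOM parser would return.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomToHtmlSketch {

    // Hypothetical stand-in for a parser result: a tiny hand-built DOM tree.
    // A real PDF-to-DOM conversion would produce a much larger tree.
    public static Document buildSampleDom() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            Element page = doc.createElement("div");
            page.setAttribute("class", "page");
            page.setTextContent("Hello from page 1");
            body.appendChild(page);
            html.appendChild(body);
            doc.appendChild(html);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build sample DOM", e);
        }
    }

    // Serialize any org.w3c.dom.Document using the JDK's "html" output method.
    public static String serializeToHtml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize DOM", e);
        }
    }

    // "Further processing" through the standard DOM API: count elements by tag name.
    public static int countElements(Document doc, String tagName) {
        return doc.getElementsByTagName(tagName).getLength();
    }

    public static void main(String[] args) {
        Document doc = buildSampleDom();
        System.out.println("div elements: " + countElements(doc, "div"));
        System.out.println(serializeToHtml(doc));
    }
}
```

Because the serializer accepts any `org.w3c.dom.Document`, the same two helper methods would presumably work unchanged on a tree produced by a library that exposes a standard DOM, which is the interoperability the description points to.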

Alternatives and similar repositories for Pdf2Dom:

Users who are interested in Pdf2Dom are comparing it to the libraries listed below.