witwall / pdf2htmlEX
Convert PDF to HTML without losing text or format.
☆21 Updated 10 years ago
Alternatives and similar repositories for pdf2htmlEX
Users who are interested in pdf2htmlEX are comparing it to the libraries listed below.
- PdfJs-Annotator is a proof-of-concept project that integrates AnnotatorJs (http://annotatorjs.org/) with the PdfJs (https://mozilla.githu… ☆25 Updated 5 years ago
- A library for extracting tables from PDF files ☆92 Updated 5 years ago
- ☆23 Updated 2 years ago
- SQL beautifier for databases including but not limited to Oracle, SQL Server, DB2, Sybase, MySQL, PostgreSQL, Teradata. ☆51 Updated last year
- Batch convert PDF files to text under Windows, using several text extraction methods or OCR ☆35 Updated 10 years ago
- Web data extraction tool implemented as a Chrome extension with many more features ☆47 Updated 7 years ago
- A tree diagram (SVG) generator. ☆83 Updated 3 years ago
- A more complete example of programming with PDFMiner, which continues where the default documentation stops ☆216 Updated 6 years ago
- Tool for visualizing hOCR output from Tesseract (or other OCR engines that support hOCR). ☆26 Updated 11 years ago
- ☆80 Updated 3 years ago
- HtmlClipper is a bookmarklet which lets you copy HTML sections of any web page together with the attached CSS styles. ☆67 Updated 4 years ago
- Notepad++ plugin to run Python scripts ☆40 Updated 6 years ago
- A Python library to detect and extract listing data from HTML pages. ☆108 Updated 8 years ago
- Extract tables from PDF pages. ☆298 Updated 5 years ago
- Foxit webpdf.js provides a world-class JavaScript library for viewing PDF files in web browsers. ☆67 Updated 5 years ago
- A simple viewer and inspection tool for text boxes in PDF documents ☆96 Updated 3 years ago
- jQuery-based XML editor plugin. ☆155 Updated last year
- An online annotation platform for teaching and learning in the humanities. ☆108 Updated last week
- HTML-Tidy plugin for Notepad++. Uses tidy-html5 - http://github.com/w3c/tidy-html5 ☆116 Updated 8 years ago
- Library to transform Chrome bookmarks to tags ☆150 Updated 6 years ago
- A library for extracting tables from PDF files ☆89 Updated 12 years ago
- Python/Flask-based website for text analysis workflow. Previous (stable) release is live at: ☆122 Updated last year
- Download, aggregate, and filter RSS feeds. ☆67 Updated 10 years ago
- Web service for implementing a large-scale translation memory ☆92 Updated 4 years ago
- Get semantic HTML from PDFs, recover lost text, tables, data... in bulk. ☆35 Updated last year
- An expandable and scalable OCR pipeline ☆89 Updated 8 years ago
- Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N… ☆277 Updated 3 years ago
- Recipes for calibre ☆69 Updated 12 years ago
- 1-Click to make ePUB, MOBI, PDF with Word Addin ☆28 Updated 11 years ago
- Table editor for creating complex tables in HTML ☆38 Updated 4 years ago