witwall / pdf2htmlEX
Convert PDF to HTML without losing text or format.
☆21Updated 10 years ago
Alternatives and similar repositories for pdf2htmlEX
Users who are interested in pdf2htmlEX are comparing it to the libraries listed below.
- Python library for manipulating Open Packaging Convention (OPC) files like .docx, .pptx, and .xlsx☆46Updated 8 years ago
- a quick and dirty script to convert a Word (docx) document to html.☆53Updated 4 years ago
- Auto complete plugin from dictionary with no external dependencies☆468Updated 7 years ago
- A Python library to detect and extract listing data from HTML pages.☆108Updated 8 years ago
- A library for extracting tables from PDF files☆92Updated 5 years ago
- Recipes for calibre☆69Updated 12 years ago
- A pair of scripts to download videos and subtitles for the TED Talks (http://www.ted.com)☆42Updated 11 years ago
- Personal clone of Poppler, official repository is here: https://gitlab.freedesktop.org/poppler/poppler☆130Updated 7 years ago
- A natural language date parser. (Python version of chrono.js)☆25Updated 2 months ago
- A library for extracting tables from PDF files☆89Updated 11 years ago
- Mouse gesture application for Windows☆50Updated last year
- 1-Click to make ePUB, MOBI, PDF with Word Addin☆28Updated 10 years ago
- An extendable docx file format parser and converter☆192Updated 2 months ago
- Find which links on a web page are pagination links☆29Updated 8 years ago
- Python client for Docverter service (pandoc as a service)☆17Updated 7 years ago
- yael (Yet Another EPUB Library) is a Python library for reading, manipulating, and writing EPUB 2/3 files☆18Updated 10 years ago
- Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.☆108Updated 4 months ago
- Plugin to use rich text in Annotator☆30Updated 10 years ago
- Wandora is a general purpose information extraction, management and publishing application based on Topic Maps and Java.☆132Updated last year
- Distributed text analysis suite based on Celery☆96Updated 2 years ago
- Command-line tool for exploring and diagnosing problems with Microsoft Office Open XML files (.docx, .pptx, .xlsx)☆54Updated 10 months ago
- A simple viewer and inspection tool for text boxes in PDF documents☆95Updated 3 years ago
- jQuery XPath plugin (with full XPath 2.0 language support)☆179Updated 3 years ago
- Custom Blockly blocks for SumoRobot programming in Python☆12Updated 6 years ago
- A small framework taking over the manual training process described in the Tesseract3 Wiki: https://code.google.com/p/tesseract-ocr/wiki/…☆132Updated 2 years ago
- Artificial Intelligence Knowledge Information Framework☆55Updated 2 years ago
- Attempts to determine the natural language of a selection of Unicode (utf-8) text (a clone of http://code.google.com/p/guess-language wit…☆48Updated 15 years ago
- An online annotation platform for teaching and learning in the humanities.☆108Updated 2 weeks ago
- WordWanderer – take your text for a walk☆12Updated 6 years ago
- NO LONGER MAINTAINED: A library for working with time and date series in Python☆45Updated 6 years ago