shebinleo / pdf2html
pdf2html is a module that converts PDF files to HTML pages using Apache Tika. It can also generate a thumbnail image for a PDF file using Apache PDFBox.
☆154, updated 5 months ago
Related projects
Alternatives and complementary repositories for pdf2html
- React component for ONLYOFFICE Document Server ☆34, updated 5 months ago
- Node.js module for rendering pdf pages to images, svgs, html files, text files and json metadata ☆90, updated last year
- A Node.js library to parse text out of any office file. Currently supports docx, pptx, xlsx, odt, odp, and ods. ☆148, updated last week
- 📰 Yet another Webassembly PDF renderer for node and the browser ☆177, updated 4 months ago
- Pure Javascript reader/writer for PowerPoint ☆130, updated 9 years ago
- ☆266, updated last month
- Get text content from any file ☆62, updated 3 months ago
- Simple node package to convert a PDF into images. ☆180, updated last month
- Muhammara, a node module with C/C++ bindings to modify PDFs with JS for Node or Electron (based on, and a replacement for, galkhana/hummusjs) ☆234, updated last month
- nodejs lib for extracting data from PDF files ☆213, updated 6 months ago
- Interactive PPTX slide viewer ☆37, updated 6 years ago
- a javascript docx parser ☆362, updated 2 months ago
- HTML to DOCX converter ☆391, updated 3 months ago
- Annotation layer for pdf.js ☆268, updated last month
- Asynchronous node.js wrapper for the Poppler PDF rendering library ☆186, updated last week
- [WIP] Web word processor for 2Tale Writer's Portal. ☆115, updated last year
- A customizable math keyboard for React ☆68, updated 3 weeks ago
- 🚜 Parse text and tables from PDF files. ☆633, updated 2 weeks ago
- Yet another library to extract text from MS Office and PDF files ☆62, updated 3 months ago
- PDF.js-based PDF files viewer with annotation support ☆78, updated 3 months ago
- 📃📸 Converts PDFs to images in nodejs ☆83, updated last week
- An npm utility program to convert office documents (documents/excel/presentations) into PDF/HTML ☆38, updated 4 years ago
- Connect HTML elements with an arrow ☆69, updated 8 months ago
- pdf2table is a node.js library that attempts to extract tables from a pdf. ☆36, updated 6 months ago
- Emscripten port of Tesseract C++ API ☆159, updated 2 months ago
- Parser to convert PPTX to JSON format ☆86, updated last year
- A standalone rich text editor based on the 2d canvas ☆41, updated 7 years ago
- WebViewer UI built in React ☆415, updated this week
- Export a prosemirror document to a Microsoft Word file, using docx. ☆108, updated 2 months ago
- Javascript library for creating annotations in PDF documents ☆553, updated last year