ScientaNL / pdf-extractor
Node.js module for rendering PDF pages to images, SVGs, HTML files, text files, and JSON metadata
☆93Updated last year
Alternatives and similar repositories for pdf-extractor:
Users who are interested in pdf-extractor are comparing it to the libraries listed below
- A Node.js library to parse text out of any office file. Currently supports docx, pptx, xlsx and odt, odp, ods.☆162Updated 2 months ago
- nodejs lib for extracting data from PDF files☆222Updated 9 months ago
- Annotation layer for pdf.js☆272Updated 4 months ago
- A rich-text editor using Prosemirror with React☆38Updated last year
- 📰 Yet another WebAssembly PDF renderer for node and the browser☆182Updated 7 months ago
- Simple node package to convert a PDF into images.☆186Updated 3 months ago
- pdf2html is a module which helps to convert PDF file to HTML pages using Apache Tika. This module also helps to generate thumbnail image …☆159Updated last month
- a javascript docx parser☆368Updated 4 months ago
- Asynchronous node.js wrapper for the Poppler PDF rendering library☆202Updated this week
- Pure Javascript reader/writer for PowerPoint☆137Updated 9 years ago
- Parser to convert PPTX to JSON format☆88Updated 2 years ago
- Generate PPTX files on the server-side with JavaScript.☆168Updated last year
- Extracts email address from an arbitrary text input.☆61Updated this week
- Node.js - Convert DOCX to PDF, PNG to PDF, get thumbnails for PDF, stream PDFs.☆81Updated 2 years ago
- Microsoft Word doc/docx to PDF conversion, client-side in-browser, using docx-wasm☆52Updated 5 years ago
- Fast Full Text Search based on BM25☆60Updated 2 years ago
- Javascript library for creating and manipulating Open XML Documents like docx, xlsx, etc. User can export grid data or images to open xml…☆28Updated last year
- A standalone rich text editor based on the 2d canvas☆41Updated 7 years ago
- MongoDB adapter for Yjs☆37Updated last year
- NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more.☆124Updated 10 months ago
- Javascript library for creating annotations in PDF documents☆567Updated last year
- WebAssembly-based JavaScript bindings for Google Compact Language Detector v3☆61Updated last year
- JavaScript implementation of most Microsoft Excel formula functions☆106Updated 3 years ago
- Parse a PowerPoint file to JSON☆40Updated 6 months ago
- Client and service for embedding highlights into PDF documents☆34Updated 2 years ago
- ☆186Updated 3 years ago
- Building PDFium for Web Assembly☆73Updated 4 years ago
- PDF to HTML (pdf2htmlEX) shell wrapper pdftohtmljs☆144Updated 2 years ago
- Tiptap 2 Extension for adding videos☆52Updated last month
- Get text content from any file☆63Updated 5 months ago