ScientaNL / pdf-extractor
Node.js module for rendering pdf pages to images, svgs, html files, text files and json metadata
☆100Updated 2 years ago
Alternatives and similar repositories for pdf-extractor
Users that are interested in pdf-extractor are comparing it to the libraries listed below
- nodejs lib for extracting data from PDF files☆241Updated last month
- pdf2html is a module which helps to convert PDF file to HTML pages using Apache Tika. This module also helps to generate thumbnail image …☆193Updated last month
- Annotation layer for pdf.js☆288Updated 11 months ago
- Extracts email address from an arbitrary text input.☆64Updated 7 months ago
- Microsoft Word doc/docx to PDF conversion, client-side in-browser, using docx-wasm☆58Updated 6 years ago
- A tiny, highly-customizable, single-function javascript/typescript library that captures a webpage and returns a new lightweight, self-co…☆237Updated last year
- 📰 Yet another Webassembly PDF renderer for node and the browser☆204Updated last year
- Add image annotation to your web apps.☆154Updated last month
- Fast Full Text Search based on BM25☆65Updated 2 years ago
- Generate PPTX files on the server-side with JavaScript.☆180Updated last year
- ☆192Updated 4 years ago
- Simple node package to convert a PDF into images.☆197Updated 10 months ago
- A Node.js library to parse text out of any office file. Currently supports docx, pptx, xlsx and odt, odp, ods.☆235Updated 2 months ago
- a javascript docx parser☆387Updated 7 months ago
- Asynchronous Node.js wrapper for the Poppler PDF rendering library☆228Updated last week
- Generates a printable paginated pdf from DOM node using HTML5 canvas and svg☆148Updated last year
- A utility for converting pdf to image and base64 format.☆481Updated 3 months ago
- A wrapper for PDF Toolkit with streams and promises.☆143Updated last year
- ☆296Updated this week
- Module for formatting and transforming text as you type in Quill☆71Updated 6 years ago
- Muhammara a node module with c/cpp bindings to modify PDF with js for node or electron (based/replacement on/of galkhana/hummusjs)☆275Updated last month
- A simple JS/TS client for interacting with a Gotenberg API☆115Updated last year
- Get text content from any file☆64Updated last year
- HTML5 Canvas implementation for NodeJS backed by Puppeteer☆64Updated 2 years ago
- Javascript library for creating annotations in PDF documents☆613Updated 2 years ago
- Pure Javascript reader/writer for PowerPoint☆148Updated 9 years ago
- Yet another library to extract text from MS Office and PDF files☆81Updated last year
- Multilingual tokenizer that automatically tags each token with its type☆62Updated 2 years ago
- PDF.js-based PDF files viewer with annotation support☆97Updated last year
- NPM package for creating a keyword array from a string and excluding stop words.☆200Updated last year