ScientaNL / pdf-extractor
Node.js module for rendering pdf pages to images, svgs, html files, text files and json metadata
☆107 Updated 2 years ago
Alternatives and similar repositories for pdf-extractor
Users that are interested in pdf-extractor are comparing it to the libraries listed below
- nodejs lib for extracting data from PDF files ☆246 Updated 6 months ago
- Extracts email address from an arbitrary text input. ☆64 Updated last year
- Microsoft Word doc/docx to PDF conversion, client-side in-browser, using docx-wasm ☆58 Updated 6 years ago
- a javascript docx parser ☆400 Updated 11 months ago
- A tiny, highly-customizable, single-function javascript/typescript library that captures a webpage and returns a new lightweight, self-co… ☆241 Updated last year
- pdf2html is a module which helps to convert PDF file to HTML pages using Apache Tika. This module also helps to generate thumbnail image … ☆201 Updated 3 weeks ago
- Fast Full Text Search based on BM25 ☆69 Updated 3 years ago
- Annotation layer for pdf.js ☆293 Updated 2 weeks ago
- NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more. ☆132 Updated last year
- Read data from a Word document using node.js ☆149 Updated last year
- Add image annotation to your web apps. ☆152 Updated 4 months ago
- 📰 Yet another Webassembly PDF renderer for node and the browser ☆212 Updated last year
- A wrapper for PDF Toolkit with streams and promises. ☆143 Updated last year
- Pure Javascript reader/writer for PowerPoint ☆153 Updated 10 years ago
- Generate PPTX files on the server-side with JavaScript. ☆188 Updated 2 months ago
- ☆194 Updated 4 years ago
- Module for formatting and transforming text as you type in Quill ☆73 Updated 6 years ago
- Generates a printable paginated pdf from DOM node using HTML5 canvas and svg ☆150 Updated last year
- NPM package for creating a keyword array from a string and excluding stop words. ☆202 Updated last year
- Simple node package to convert a PDF into images. ☆200 Updated last year
- A utility for converting pdf to image and base64 format. ☆498 Updated 8 months ago
- Node module wrapper for WordNet dictionary. ☆53 Updated 3 years ago
- Simple tool for converting PDF to text using OCR ☆100 Updated 2 years ago
- A robust, strictly-typed Node.js and Browser library for parsing office files (docx, pptx, xlsx, odt, odp, ods, pdf, rtf). It produces a … ☆276 Updated 3 weeks ago
- JavaScript implementation of most Microsoft Excel formula functions ☆105 Updated 4 years ago
- javascript nodejs excel formula parser ☆123 Updated last year
- HTML to DOCX converter ☆476 Updated 9 months ago
- ☆305 Updated this week
- Image annotation block for Airtable ☆46 Updated 4 years ago
- Multilingual tokenizer that automatically tags each token with its type ☆65 Updated 2 years ago