ScientaNL / pdf-extractor
Node.js module for rendering PDF pages to images, SVGs, HTML files, text files and JSON metadata
☆105 Updated 2 years ago
Alternatives and similar repositories for pdf-extractor
Users who are interested in pdf-extractor are comparing it to the libraries listed below:
- nodejs lib for extracting data from PDF files ☆245 Updated 4 months ago
- Extracts email address from an arbitrary text input. ☆64 Updated 10 months ago
- Fast Full Text Search based on BM25 ☆69 Updated 3 years ago
- Microsoft Word doc/docx to PDF conversion, client-side in-browser, using docx-wasm ☆58 Updated 6 years ago
- RFC 822 EML file format parser and builder ☆96 Updated 2 years ago
- A wrapper for PDF Toolkit with streams and promises. ☆143 Updated last year
- Read data from a Word document using node.js ☆148 Updated last year
- NPM package for creating a keyword array from a string and excluding stop words. ☆200 Updated last year
- NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more. ☆133 Updated last year
- a javascript docx parser ☆393 Updated 9 months ago
- Generates a printable paginated pdf from DOM node using HTML5 canvas and svg ☆149 Updated last year
- Language agnostic named entity recognizer ☆40 Updated 2 years ago
- Annotation layer for pdf.js ☆289 Updated last year
- Muhammara: a Node module with C/C++ bindings for modifying PDFs with JavaScript in Node or Electron (based on, and a replacement for, galkahana/HummusJS) ☆287 Updated 2 weeks ago
- Machine learning based text classification in JavaScript using n-grams and cosine similarity ☆132 Updated last year
- Extract text from pdfs that contain searchable pdf text ☆116 Updated 6 years ago
- javascript nodejs excel formula parser ☆123 Updated last year
- Generate PPTX files on the server-side with JavaScript. ☆187 Updated 2 weeks ago
- Simple node package to convert a PDF into images. ☆197 Updated last year
- ☆194 Updated 4 years ago
- HTML5 Canvas implementation for NodeJS backed by Puppeteer ☆65 Updated 2 years ago
- A module for node.js and the browser that takes in text and strips it of stopwords ☆258 Updated last month
- mention tool for editor.js ☆26 Updated 6 years ago
- Add image annotation to your web apps. ☆152 Updated 2 months ago
- A powerful PDF tool for NodeJS based on HummusJS. ☆350 Updated 2 years ago
- pdf2html is a module which helps to convert PDF file to HTML pages using Apache Tika. This module also helps to generate thumbnail image … ☆198 Updated 2 weeks ago
- A simple JS/TS client for interacting with a Gotenberg API ☆115 Updated last year
- A high-performance in-memory convertor to convert svg to png/jpeg images for Node. ☆167 Updated 2 years ago
- Parser to convert PPTX to JSON format ☆91 Updated 2 years ago
- 🕷🚀 Scrapes/Crawls the logo from a provided url(s)/website for your Node.js applications. ☆49 Updated 2 years ago