ScientaNL / pdf-extractor
Node.js module for rendering PDF pages to images, SVGs, HTML files, text files and JSON metadata
☆104 · Updated 2 years ago
Alternatives and similar repositories for pdf-extractor
Users interested in pdf-extractor are comparing it to the libraries listed below.
- nodejs lib for extracting data from PDF files ☆244 · Updated 3 months ago
- Yet another WebAssembly PDF renderer for node and the browser ☆209 · Updated last year
- a javascript docx parser ☆392 · Updated 9 months ago
- Extracts email address from an arbitrary text input. ☆64 · Updated 9 months ago
- Annotation layer for pdf.js ☆289 · Updated last year
- NPM package for creating a keyword array from a string and excluding stop words. ☆199 · Updated last year
- Fast Full Text Search based on BM25 ☆67 · Updated 2 years ago
- A wrapper for PDF Toolkit with streams and promises. ☆143 · Updated last year
- pdf2html is a module which helps to convert PDF file to HTML pages using Apache Tika. This module also helps to generate thumbnail image … ☆197 · Updated 3 months ago
- Microsoft Word doc/docx to PDF conversion, client-side in-browser, using docx-wasm ☆58 · Updated 6 years ago
- Asynchronous Node.js wrapper for the Poppler PDF rendering library ☆232 · Updated this week
- A tiny, highly-customizable, single-function javascript/typescript library that captures a webpage and returns a new lightweight, self-co… ☆240 · Updated last year
- Language agnostic named entity recognizer ☆39 · Updated 2 years ago
- ☆299 · Updated 2 months ago
- NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more. ☆132 · Updated last year
- A high-performance in-memory convertor to convert svg to png/jpeg images for Node. ☆167 · Updated 2 years ago
- Get text content from any file ☆64 · Updated last year
- Add image annotation to your web apps. ☆153 · Updated last month
- Read data from a Word document using node.js ☆148 · Updated last year
- javascript nodejs excel formula parser ☆123 · Updated last year
- Pure Javascript reader/writer for PowerPoint ☆149 · Updated 10 years ago
- HTML to DOCX converter ☆476 · Updated 7 months ago
- A Node.js library to parse text out of any office file. Currently supports docx, pptx, xlsx and odt, odp, ods.. ☆243 · Updated this week
- Simple node package to convert a PDF into images. ☆196 · Updated last year
- Open Graph, Twitter Card, Oembed preview. Shows visual cards that mimic link previews in Social Media like facebook, twitter, vk and othe… ☆76 · Updated 2 years ago
- A module for node.js and the browser that takes in text and strips it of stopwords ☆257 · Updated 3 weeks ago
- Muhammara a node module with c/cpp bindings to modify PDF with js for node or electron (based/replacement on/of galkhana/hummusjs) ☆287 · Updated last month
- ☆301 · Updated 9 months ago
- RFC 822 EML file format parser and builder ☆96 · Updated 2 years ago
- A utility for converting pdf to image and base64 format. ☆490 · Updated 5 months ago