ScientaNL / pdf-extractor
Node.js module for rendering pdf pages to images, svgs, html files, text files and json metadata
☆100 Updated 2 years ago
Alternatives and similar repositories for pdf-extractor
Users that are interested in pdf-extractor are comparing it to the libraries listed below
- nodejs lib for extracting data from PDF files ☆234 Updated last year
- A Node.js library to parse text out of any office file. Currently supports docx, pptx, xlsx and odt, odp, ods. ☆211 Updated last week
- pdf2html is a module which helps to convert PDF file to HTML pages using Apache Tika. This module also helps to generate thumbnail image … ☆184 Updated last week
- Extracts email address from an arbitrary text input. ☆62 Updated 5 months ago
- Simple node package to convert a PDF into images. ☆194 Updated 8 months ago
- Get text content from any file ☆65 Updated 10 months ago
- Asynchronous Node.js wrapper for the Poppler PDF rendering library ☆219 Updated this week
- Fast Full Text Search based on BM25 ☆63 Updated 2 years ago
- Microsoft Word doc/docx to PDF conversion, client-side in-browser, using docx-wasm ☆55 Updated 6 years ago
- ☆189 Updated 4 years ago
- javascript nodejs excel formula parser ☆118 Updated 10 months ago
- A wrapper for PDF Toolkit with streams and promises. ☆142 Updated last year
- Parser to convert PPTX to JSON format ☆90 Updated 2 years ago
- Annotation layer for pdf.js ☆285 Updated 9 months ago
- NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more. ☆131 Updated last year
- mention tool for editor.js ☆26 Updated 5 years ago
- Generate PPTX files on the server-side with JavaScript. ☆177 Updated last year
- Muhammara a node module with c/cpp bindings to modify PDF with js for node or electron (based/replacement on/of galkhana/hummusjs) ☆271 Updated 6 months ago
- A tiny (< 100 LoC) library for trimming whitespace from a canvas element with no dependencies ☆71 Updated 5 years ago
- ☆289 Updated 4 months ago
- Multilingual tokenizer that automatically tags each token with its type ☆62 Updated 2 years ago
- A module for node.js and the browser that takes in text and strips it of stopwords ☆252 Updated 3 weeks ago
- NPM package for creating a keyword array from a string and excluding stop words. ☆200 Updated last year
- Image annotation block for Airtable ☆46 Updated 4 years ago
- NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, … ☆48 Updated 5 months ago
- A database and connection provider for Yjs based on Firestore (Firebase). 🔥 y-fire helps you create serverless collaborative web apps. ☆65 Updated 9 months ago
- Add image annotation to your web apps. ☆153 Updated 3 months ago
- Emscripten port of Tesseract C++ API ☆174 Updated 6 months ago
- SpellcheckerWasm is an extremely fast spellchecker for WebAssembly based on SymSpell ☆59 Updated 2 years ago
- Undo history for ProseMirror ☆51 Updated 8 months ago