harshankur / officeParser
A robust, strictly-typed Node.js and Browser library for parsing office files (docx, pptx, xlsx, odt, odp, ods, pdf, rtf). It produces a clean, hierarchical Abstract Syntax Tree (AST) with rich metadata, text formatting, and full attachment support.
☆271 · Updated 3 weeks ago
Alternatives and similar repositories for officeParser
Users who are interested in officeParser are comparing it to the libraries listed below.
- Yet another library to extract text from MS Office and PDF files ☆84 · Updated last month
- pdf2html is a module which helps to convert PDF file to HTML pages using Apache Tika. This module also helps to generate thumbnail image … ☆201 · Updated 2 weeks ago
- Generate PPTX files on the server-side with JavaScript. ☆187 · Updated 2 months ago
- Library to convert PDF to PNG ☆168 · Updated last month
- TypeScript wrapper for the PDFium library, works in browser and Node.js ☆154 · Updated 3 weeks ago
- Generate vector embeddings in Node.js ☆172 · Updated last month
- Create PowerPoint presentations with React ☆202 · Updated 11 months ago
- Simple node package to convert a PDF into images. ☆200 · Updated last year
- HTML to DOCX converter ☆476 · Updated 9 months ago
- ☆157 · Updated 2 years ago
- Parse incomplete JSON text in a best-effort manner ☆272 · Updated 6 months ago
- Export a prosemirror document to a Microsoft Word file, using docx. ☆156 · Updated 6 months ago
- Fast HTML to markdown converter for Node.js or the browser ☆249 · Updated 2 months ago
- 📰 Yet another WebAssembly PDF renderer for node and the browser ☆212 · Updated last year
- Parse partial JSON generated by LLM ☆211 · Updated 5 months ago
- A simple vector database built on idb ☆104 · Updated 2 years ago
- ☆305 · Updated this week
- 📃📸 Converts PDFs to images in Node.js ☆136 · Updated 4 months ago
- Node.js bindings for faiss ☆138 · Updated 2 years ago
- Muhammara, a Node module with C/C++ bindings to modify PDFs with JS for Node or Electron (based on, and a replacement for, galkhana/hummusjs) ☆293 · Updated last week
- Node.js bindings for OpenAI's Whisper. (C++ CPU version by ggerganov) ☆295 · Updated last year
- Parser to convert PPTX to JSON format ☆92 · Updated 3 years ago
- Node.js library for extracting data from PDF files ☆246 · Updated 6 months ago
- Example of drag-n-drop snippets in Tiptap. See demo-video for more info! ☆110 · Updated 3 years ago
- ☆153 · Updated 11 months ago
- A Node.js RAG framework to easily work with LLMs and embeddings ☆599 · Updated 2 months ago
- A lightweight TypeScript library that interacts with Gotenberg's different modules to convert a variety of document formats to PDF files. ☆155 · Updated this week
- Asynchronous Node.js wrapper for the Poppler PDF rendering library ☆236 · Updated 3 weeks ago
- Streaming, source-agnostic EventSource/Server-Sent Events parser ☆450 · Updated last month
- Column extension for tiptap v2 ☆111 · Updated 2 years ago