dbashford / textract
node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf and more!
☆1,680 Updated 2 years ago
Alternatives and similar repositories for textract
Users that are interested in textract are comparing it to the libraries listed below
- converts binary PDF to JSON and text, for server-side PDF processing and command-line use. Zero dependency. ☆2,118 Updated this week
- Advanced html to text converter ☆1,664 Updated last year
- Standalone Office Open XML files (Microsoft Office 2007 and later) generator for Word (docx), PowerPoint (pptx) and Excel (xlsx) in java…