dbashford / textract
node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf and more!
☆1,685 Updated 3 years ago
Alternatives and similar repositories for textract
Users that are interested in textract are comparing it to the libraries listed below
- A wrapper for the wkhtmltopdf HTML to PDF converter using WebKit ☆616 Updated 2 years ago
- converts binary PDF to JSON and text, for server-side PDF processing and command-line use. Zero dependency. ☆2,140 Updated 2 weeks ago
- Node.js module for high performance creation, modification and parsing of PDF files and streams ☆1,167 Updated 2 months ago
- Node PDF Extract ☆389 Updated 2 years ago
- A persistent, network resilient, full text search library for the browser and Node.js ☆1,420 Updated 6 months ago
- 🚜 Parse text and tables from PDF files. ☆692 Updated 8 months ago
- A javascript library for defining recurring schedules and calculating future (or past) occurrences for them. Includes support for using … ☆2,420 Updated 7 years ago
- This repo isn't maintained anymore as phantomjs got deprecated a long time ago. Please migrate to headless chrome/puppeteer. ☆3,564 Updated last year
- Automatically extract body content (and other cool stuff) from an html document ☆2,158 Updated 2 years ago
- Advanced html to text converter ☆1,675 Updated last year
- Finds degree of similarity between two strings, based on Dice's Coefficient, which is mostly better than Levenshtein distance. ☆2,529 Updated 2 years ago
- Full featured CSV parser with simple api and tested against large datasets. ☆4,213 Updated last month
- Decode mime formatted e-mails ☆1,640 Updated last month
- CSV parser and formatter for node ☆1,752 Updated last week
- A simple wrapper for the Tesseract OCR package ☆676 Updated 5 years ago
- Node module that summarizes text using a naive summarization algorithm ☆770 Updated 11 months ago
- a streaming interface for archive generation ☆2,911 Updated this week
- An XML builder for node.js ☆924 Updated last year
- PDF manipulation in Node.js! Split, join, crop, read, extract, boil, mash, stick them in a stew. ☆286 Updated 7 months ago
- A Javascript implementation of zip for nodejs. Allows user to create or extract zip files both in memory or to/from disk ☆2,137 Updated 7 months ago
- Scrape/Crawl article from any site automatically. Make any web page readable, no matter Chinese or English. ☆345 Updated 7 years ago
- Node module to allow for easy Excel file creation ☆1,370 Updated 3 years ago
- Part-of-speech utilities for node.js based on the WordNet database. ☆477 Updated 2 years ago
- a javascript docx parser ☆389 Updated 7 months ago
- Download and extract files ☆1,301 Updated last year
- Streaming csv parser inspired by binary-csv that aims to be faster than everyone else ☆1,482 Updated 8 months ago
- Blazing fast and Comprehensive CSV Parser for Node.JS / Browser / Command Line. ☆2,022 Updated 2 years ago
- Convert json to csv with column titles ☆2,728 Updated 2 years ago
- HTML to PDF or image (jpeg, png, webp) converter via Chrome/Chromium ☆794 Updated 4 months ago
- Natural language detection ☆4,318 Updated last year