ahkimkoo / node-article-extractor
Automatically extract body content (and other cool stuff) from an HTML document. Based on https://github.com/ageitgey/node-unfluff, but with support for Chinese.
☆17 Updated 4 years ago
Alternatives and similar repositories for node-article-extractor
Users that are interested in node-article-extractor are comparing it to the libraries listed below
- Automatically extract body content (and other cool stuff) from an html document ☆2,163 Updated 2 years ago
- node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf and more! ☆1,692 Updated last month
- Scrape/Crawl article from any site automatically. Make any web page readable, no matter Chinese or English. ☆346 Updated 7 years ago
- plugin to extract keywords and key-phrases ☆337 Updated last year
- NPM package for creating a keyword array from a string and excluding stop words. ☆202 Updated last year
- Machine learning based text classification in JavaScript using n-grams and cosine similarity ☆134 Updated last year
- Node module that summarizes text using a naive summarization algorithm ☆770 Updated 2 weeks ago
- fasttag part of speech tagger javascript implementation ☆280 Updated 5 years ago
- Use puppeteer to test and control your electron application. ☆357 Updated 2 years ago
- A nodejs module for converting pdf into image file ☆75 Updated 4 years ago
- A module for node.js and the browser that takes in text and strips it of stopwords ☆261 Updated 2 weeks ago
- text-to-png generator for Node.js ☆193 Updated 3 years ago
- Generate EPUB books from HTML with simple API in Node.js. ☆454 Updated 2 years ago
- Part-of-speech utilities for node.js based on the WordNet database. ☆477 Updated 3 years ago
- Get all urls in a string ☆372 Updated 2 years ago
- Extract colors from GIF, PNG, JPG, and SVG images ☆353 Updated 3 years ago
- Advanced html to text converter ☆1,686 Updated 2 years ago
- Sentence Boundary Detection in javascript for node. http://tessmore.github.io/sbd/ ☆221 Updated 2 years ago
- A persistent, network resilient, full text search library for the browser and Node.js ☆1,426 Updated 10 months ago
- PDF to HTML (pdf2htmlEX) shell wrapper pdftohtmljs ☆146 Updated 3 years ago
- Robust RSS, Atom, and RDF feed parsing in Node.js ☆1,979 Updated 2 years ago
- Node.js module for high performance creation, modification and parsing of PDF files and streams ☆1,174 Updated last week
- Get image size without full download. Supported image types: JPG, GIF, PNG, WebP, BMP, TIFF, SVG, PSD, ICO. ☆1,017 Updated 2 years ago
- A Node.js module to search and scrape Google. ☆456 Updated 7 years ago
- 🇫🇷 NodeJS language detection library using n-gram ☆410 Updated 5 years ago
- Native node.js printer ☆1,567 Updated 3 years ago
- A dead-simple promise wrapper for nedb. ☆300 Updated last week
- Better Queue for NodeJS ☆546 Updated last year
- Read data from a Word document using node.js ☆149 Updated last year
- A nodejs package that returns base64 image data for a path's icon. ☆33 Updated 7 years ago