indix / web-auto-extractor
Automatically extracts structured information from webpages
☆108 Updated 2 years ago
Alternatives and similar repositories for web-auto-extractor:
Users who are interested in web-auto-extractor are comparing it to the libraries listed below.
- A `htmlparser2` handler for parsing rich metadata from HTML. Includes HTML metadata, JSON-LD, RDFa, microdata, OEmbed, Twitter cards and … ☆53 Updated last year
- A NodeJS implementation of the Rapid Automatic Keyword Extraction algorithm. ☆102 Updated last year
- Cheerio based microdata parser ☆58 Updated 3 years ago
- A suite of modules for text analysis, including simple analysis, nGrams, and TFIDF analysis ☆49 Updated 4 years ago
- PhantomJS resource pool based on generic-pool ☆106 Updated 5 years ago
- Module that extracts structured information from a rendered html site and outputs JSON. HTML to JSON. ☆69 Updated 3 years ago
- Scrape & parse a webpage to return a JSON with found microdata (schema.org) ☆43 Updated 7 years ago
- Parser for robots.txt for node.js ☆67 Updated 3 years ago
- Node wrapper around FastText Library ☆57 Updated 2 years ago
- Freeform Street Address Parser ☆95 Updated last year
- NodeJS Named Entity Recognition, using Stanford NER (easy install) ☆40 Updated 7 years ago
- NodeJS bindings to libpostal for fast international address parsing/normalization ☆228 Updated last month
- Helps to extract shortest optimal css-selector and multi-selector. ☆26 Updated 7 years ago
- Friendly web crawler for x-ray ☆44 Updated 2 years ago
- mbox file parser for Node.js ☆70 Updated 4 years ago
- Maxmind GeoIP2 Web Services for Node.js ☆47 Updated last week
- Puppeteer resource pool based on generic-pool ☆64 Updated 5 years ago
- A node.js module to help identify browser sessions ☆59 Updated 3 weeks ago
- MetaData html scraper and parser for Node.js (supports Promises and callback style) ☆171 Updated 2 weeks ago
- Deprecated plugin to detect sentiment: use `words/polarity` ☆97 Updated 4 months ago
- Sample memory usage for your Node.js program and write the samples to a stream ☆74 Updated 4 years ago
- This stemming module for Node.js provides stemming capability for a variety of languages using Dr. M.F. Porter's Snowball API. ☆51 Updated this week
- Tokenize paragraphs into sentences, and smaller tokens. ☆49 Updated last year
- ImageResolver.js does its best to determine the main image on a URL without loading all images. ☆163 Updated 7 years ago
- text mining utilities for Node.js ☆141 Updated 2 years ago
- A module for node.js and the browser that takes in text and strips it of stopwords ☆244 Updated 2 months ago
- schema.org in JS (work in progress) ☆44 Updated 2 years ago
- A url and referrer parsing library for node. ☆73 Updated last year
- Node module to interact with the gmail api ☆153 Updated 3 years ago
- A nodejs Scraping Utility for lazy people. MIT Licensed ☆44 Updated 2 years ago