indix / web-auto-extractor
Automatically extracts structured information from webpages
☆109 · Updated 2 years ago
Alternatives and similar repositories for web-auto-extractor
Users who are interested in web-auto-extractor are comparing it to the libraries listed below.
- A `htmlparser2` handler for parsing rich metadata from HTML. Includes HTML metadata, JSON-LD, RDFa, microdata, OEmbed, Twitter cards and … ☆55 · Updated last year
- Module that extracts structured information from a rendered html site and outputs JSON. HTML to JSON. ☆70 · Updated 3 years ago
- Friendly web crawler for x-ray ☆44 · Updated 2 years ago
- A suite of modules for text analysis, including simple analysis, nGrams, and TFIDF analysis ☆48 · Updated 4 years ago
- Freeform Street Address Parser ☆95 · Updated 2 years ago
- Deprecated plugin to detect sentiment: use `words/polarity` ☆97 · Updated 7 months ago
- Node wrapper around FastText Library ☆57 · Updated 2 years ago
- Cheerio based microdata parser ☆57 · Updated 3 years ago
- Nodejs module for Extracting Concepts from text. ☆10 · Updated last year
- Tokenize paragraphs into sentences, and smaller tokens. ☆48 · Updated last year
- This stemming module for Node.js provides stemming capability for a variety of languages using Dr. M.F. Porter's Snowball API. ☆51 · Updated 2 months ago
- Scrape & parse a webpage to return a JSON with found microdata (schema.org) ☆43 · Updated 7 years ago
- Parser for robots.txt for node.js ☆67 · Updated 4 years ago
- A nodejs Scraping Utility for lazy people. MIT Licensed ☆44 · Updated 3 years ago
- Helps to extract shortest optimal css-selector and multi-selector. ☆26 · Updated 7 years ago
- schema.org in JS (work in progress) ☆44 · Updated 2 years ago
- English NLP for Node.js and the browser. ☆87 · Updated last year
- NodeJS bindings to libpostal for fast international address parsing/normalization ☆231 · Updated 4 months ago
- Create HTML snippets/embeds from URLs using info from oEmbed, Open Graph, meta tags. ☆66 · Updated last year
- Give me your coordinates and I'll tell you where the nearest cities are. ☆46 · Updated 4 years ago
- sandcrawler.js - the server-side scraping companion. ☆107 · Updated 9 years ago
- A node.js wrapper for Boilerpipe, an excellent Java library for boilerplate removal and fulltext extraction from HTML pages. ☆52 · Updated 7 years ago
- Node library to extract keywords from text ☆58 · Updated 9 years ago
- A JS Library that compares two DOM Nodes and outputs what changed between the two. ☆154 · Updated 8 years ago
- Declarative JSON-to-JSON mapper .. shape-shift JSON files with Ditto 👻 ☆74 · Updated 2 years ago
- Creates screencasting-like gifs of page-scrolls ☆16 · Updated 2 years ago
- LDA-Based Topic Modelling in Javascript ☆44 · Updated 10 years ago
- Extracts email address from an arbitrary text input. ☆62 · Updated 4 months ago
- A lightweight JavaScript client library for the Wikimedia Pageviews API for Wikipedia and various of its sister projects for Node.js and … ☆27 · Updated 4 years ago
- Nodejs text summarization ☆54 · Updated 11 years ago