xissy / node-boilerpipe
A node.js wrapper for Boilerpipe, an excellent Java library for boilerplate removal and fulltext extraction from HTML pages.
☆52 · Updated 7 years ago
Alternatives and similar repositories for node-boilerpipe:
Users who are interested in node-boilerpipe are comparing it to the libraries listed below.
- Node.js module to extract and summarize html content ☆42 · Updated 10 years ago
- A simple node.js wrapper for stanford-core-nlp. ☆148 · Updated 7 years ago
- Martin Porter's stemmer for node.js ☆100 · Updated 4 years ago
- Friendly web crawler for x-ray ☆44 · Updated 2 years ago
- A simple node.js wrapper for Stanford CoreNLP. ☆78 · Updated 2 years ago
- A PredictionIO 0.9+ client ☆60 · Updated 6 years ago
- Redis time series statistics with Node.js ☆181 · Updated 8 years ago
- NodeJS Named Entity Recognition, using Stanford NER (easy install) ☆40 · Updated 7 years ago
- A simple-but-useful kNN library for NodeJS, comparing JSON Objects using Euclidean distances ☆214 · Updated 9 years ago
- Node library to extract keywords from text ☆58 · Updated 9 years ago
- phantom driver for x-ray. ☆111 · Updated 8 years ago
- fetch & parse ATOM & RSS feeds with Node.js ☆74 · Updated 6 years ago
- The selection parser for x-ray. Aiming to bring structure to the web. ☆20 · Updated 9 years ago
- remote monitoring and debugging for socket.io ☆451 · Updated 10 years ago
- A helper robot written in node javascript ☆74 · Updated 13 years ago
- node.js wrapper for the Diffbot API (article and frontpage) ☆35 · Updated 9 years ago
- Parser for robots.txt for node.js ☆67 · Updated 3 years ago
- A 2nd generation spider to crawl any article site, automatic read title and article. ☆43 · Updated 9 years ago
- Rate monitoring and limiting for express.js apps ☆80 · Updated 9 years ago
- A web crawler/scraper/spider for nodejs ☆66 · Updated 7 years ago
- Extract the content of any web page by using various content extractor libraries. ☆10 · Updated 9 years ago
- WordNet Database files (previously WNdb) ☆215 · Updated 5 years ago
- Tokenize paragraphs into sentences, and smaller tokens. ☆49 · Updated last year
- REST interface for Redis-Simple-Message-Queue ☆43 · Updated 8 years ago
- Redis adapter for SocketCluster ☆45 · Updated 5 years ago
- sandcrawler.js - the server-side scraping companion. ☆107 · Updated 9 years ago
- Watches for changes in MongoDB replication log. ☆95 · Updated 9 years ago
- Google speech api wrapper for node ☆86 · Updated 9 years ago
- Cheerio fork that uses parse5 as the underlying platform ☆51 · Updated 4 years ago
- A simple way to rate limit how often a function is executed. ☆77 · Updated last month