Tjatse / spider2
A 2nd-generation spider that crawls any article site and automatically reads the title and article.
☆43 · Updated 9 years ago
Alternatives and similar repositories for spider2:
Users who are interested in spider2 are comparing it to the libraries listed below.
- Friendly web crawler for x-ray ☆44 · Updated 2 years ago
- Crawler is a web spider written with Nodejs. It gives you the full power of jQuery on the server to parse a big number of pages as they a… ☆41 · Updated 8 years ago
- x-ray's selector parser. ☆16 · Updated 9 years ago
- Simhash implementation in Javascript ☆38 · Updated 7 years ago
- Extract a list of keywords from a website, sorted by word count. ☆51 · Updated 8 years ago
- NodeJS Named Entity Recognition, using Stanford NER (easy install) ☆40 · Updated 7 years ago
- A node.js wrapper for Boilerpipe, an excellent Java library for boilerplate removal and fulltext extraction from HTML pages. ☆52 · Updated 7 years ago
- Qool, a leveldb backed Queue ☆42 · Updated 8 years ago
- Extract meta-data from an HTML string. It extracts the body, title, meta-tags and first headlines to an object to push them to a search ind… ☆13 · Updated 8 years ago
- Redis Message Connector ☆14 · Updated 5 years ago
- A web crawler/scraper/spider for nodejs ☆66 · Updated 7 years ago
- Token bucket based HTTP request throttle for Node.js ☆16 · Updated 8 years ago
- Extract the content of any web page by using various content extractor libraries. ☆10 · Updated 9 years ago
- Take screenshots ☆40 · Updated 2 years ago
- phantom driver for x-ray. ☆111 · Updated 8 years ago
- Distributed job queue for node ☆120 · Updated 9 years ago
- Hosted viewer for documentation.js JSON output. ☆34 · Updated 7 years ago
- Redis-based task queue library inspired by Celery and Kue. ☆56 · Updated 10 years ago
- A lightweight proxy that lets you drive phantomjs from node. ☆136 · Updated 10 years ago
- A cache connector for Redis ☆21 · Updated 3 months ago
- PageRank calculation for ngraph.graph ☆28 · Updated last week
- Node wrapper around FastText Library ☆57 · Updated last year
- NLP utilities in javascript and coffeescript ☆38 · Updated 11 years ago
- Nodejs Client for Cayley ☆57 · Updated 8 years ago
- A NodeJS library to keep an eye on your memory usage, and discover and isolate leaks. ☆12 · Updated 8 years ago
- Easily query social media likes/followers without tokens (for node) ☆16 · Updated 5 years ago
- Linear regression with Gradient descent package for NPM. ☆46 · Updated 11 years ago
- sandcrawler.js - the server-side scraping companion. ☆107 · Updated 9 years ago
- node.js wrapper for the Diffbot API (article and frontpage) ☆35 · Updated 9 years ago
- The selection parser for x-ray. Aiming to bring structure to the web. ☆20 · Updated 9 years ago