tilakpatidar / bot-marvin

Highly scalable crawler with best features.

☆11

Alternatives and similar repositories for bot-marvin

Users who are interested in bot-marvin are comparing it to the libraries listed below.

HuddleEng / puppeteer-extensions
Convenience functions for Puppeteer
☆25 Updated 2 years ago
rrweb-io / rrweb-chrome-extension
The Chrome extension for rrweb, which helps run rrweb on any website out of the box
☆20 Updated 2 years ago
hfreire / perseverance
Make your functions resilient and fail-fast to failures or delays
☆13 Updated last year
Tjatse / spider2
A 2nd generation spider to crawl any article site; automatically reads the title and article.
☆43 Updated 9 years ago
christophebe / check-domain
A simple component to check the status of a domain (whois, availability, expired, PR, TrustFlow, ...)
☆33 Updated 8 years ago
JamieMason / image-optimisation-tools-comparison
A Benchmarking Suite for popular Image Optimisation Tools
☆28 Updated 4 years ago
jsnomad / Google-Scraper
Extract links from Google SERP
☆48 Updated 8 years ago
andreasgal / predict.js
Predictive text in JavaScript
☆29 Updated 12 years ago
medialab / sandcrawler
sandcrawler.js - the server-side scraping companion.
☆107 Updated 9 years ago
transitive-bullshit / puppeteer-render-text
Robust text renderer using headless chrome.
☆66 Updated last year
mpneuried / html-extractor
Extract meta-data from a html string. It extracts the body, title, meta-tags and first headlines to a object to push them to a search ind…
☆13 Updated 9 years ago
itemsapi / elasticitems
Higher level client for Elasticsearch written in Node.js oriented on facets and simplicity
☆20 Updated 5 months ago
matthewmueller / x-ray-crawler
Friendly web crawler for x-ray
☆44 Updated 2 years ago
nodeca / embedza
Create HTML snippets/embeds from URLs using info from oEmbed, Open Graph, meta tags.
☆66 Updated 2 years ago
winkjs / wink-naive-bayes-text-classifier
Naive Bayes Text Classifier
☆40 Updated 4 months ago
blakeembrey / node-htmlmetaparser
A `htmlparser2` handler for parsing rich metadata from HTML. Includes HTML metadata, JSON-LD, RDFa, microdata, OEmbed, Twitter cards and …
☆55 Updated last year
googlearchive / dev-video-search
Prototype API and sample app for searching Google developer videos
☆13 Updated 9 years ago
velocityzen / meta-extractor
Super simple and fast html page meta data extractor with low memory footprint
☆36 Updated 2 years ago
danielnieto / scrapman
Retrieve real (with JavaScript executed) HTML code from a URL; ultra fast, with support for loading multiple pages in parallel
☆22 Updated 7 years ago
ChristianRich / phone-number-extractor
Identifies and extracts phone numbers from arbitrary text
☆39 Updated 8 years ago
benfoxall / tweets
☆33 Updated 11 years ago
vtempest / bypasscors
Bypass CORS (Cross-Origin Resource Sharing): get HTML from external domains and make your own API
☆14 Updated 7 years ago
crawlbase / proxycrawl-node
ProxyCrawl Node library for scraping and crawling
☆23 Updated 2 years ago
joewhite86 / proxy-rotator
Simple proxy rotation service
☆30 Updated 9 years ago
transitive-bullshit / puppeteer-render-text-cli
CLI for rendering text with headless chrome.
☆11 Updated 5 years ago
pdehaan / summarizer
Scrapes a remote page and creates a summary with statistics
☆39 Updated 10 years ago
muety / http2-serverpush-proxy
A simple standalone reverse proxy that automatically enables server-push for assets related to a HTTP response.
☆24 Updated 8 years ago
syzer / sentiment-analyser
ML that can extract German and English sentiment
☆36 Updated 4 years ago
wingify / dom-comparator
A JS Library that compares two DOM Nodes and outputs what changed between the two.
☆155 Updated 9 years ago
cmawhorter / hmmac
Flexible nodejs HMAC authentication module for express/connect and beyond
☆36 Updated 5 years ago