momer/nutch-selenium-grid-plugin

A Nutch 2.2.1 plugin that lets users hand off page retrieval to a Selenium hub/node system. Nutch relies on Selenium/Firefox to fetch pages and load their JavaScript-driven content, while Nutch stays in charge of what it does best: crawling and further parsing.
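
The hand-off described above can be sketched in a few lines. This is not the plugin's code (the plugin is a Java component inside Nutch); it is a minimal Python illustration that assumes the hub speaks the W3C WebDriver HTTP protocol. The names `http_json` and `fetch_rendered_html`, and the hub address in the usage note, are hypothetical.

```python
import json
import urllib.request


def http_json(method, url, payload=None):
    """Send one JSON request and return the decoded JSON response."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"Content-Type": "application/json; charset=utf-8"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_rendered_html(hub_url, page_url, send=http_json):
    """Load page_url in a remote browser behind the hub and return its HTML.

    `send` is injectable so the flow can be exercised without a real hub.
    """
    hub_url = hub_url.rstrip("/")
    # 1. Open a session; the hub routes it to a node that offers Firefox.
    wanted = {"capabilities": {"alwaysMatch": {"browserName": "firefox"}}}
    session_id = send("POST", hub_url + "/session", wanted)["value"]["sessionId"]
    session_url = hub_url + "/session/" + session_id
    try:
        # 2. Navigate; the call returns once the browser reports the page loaded.
        send("POST", session_url + "/url", {"url": page_url})
        # 3. Read back the current DOM, serialized as HTML.
        return send("GET", session_url + "/source")["value"]
    finally:
        # 4. Always end the session so the node's browser is released.
        send("DELETE", session_url)
```

A crawler would call something like `fetch_rendered_html("http://localhost:4444", url)` in place of a plain HTTP GET and pass the returned HTML to its parser. Older Selenium Grid versions typically expose the hub under a `/wd/hub` suffix, so the base URL is an assumption to adjust for the actual deployment.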

☆16

Alternatives and similar repositories for nutch-selenium-grid-plugin

Users that are interested in nutch-selenium-grid-plugin are comparing it to the libraries listed below.

momer / nutch-selenium
☆28 · Jun 9, 2016 · Updated 10 years ago
meabed / nutch-cassandra-docker
Nutch with Cassandra and Elasticsearch on Docker
☆17 · Oct 26, 2021 · Updated 4 years ago
BayanGroup / nutch-custom-search
☆67 · Dec 11, 2016 · Updated 9 years ago
appsembler / edx_xblock_scorm
XBlock to use SCORM content in Open edX. Main development in use_ssla_player branch, requires commercial SSLA player by JCA Solutions.
☆12 · Jun 21, 2023 · Updated 3 years ago
EOSVR / EOSVR
EOSVR Introduction.
☆16 · Jul 28, 2019 · Updated 6 years ago
y001j / IoT_Gateway
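麦恩 IoT Gateway is a high-performance IoT data gateway platform with a new architecture based on a high-speed data bus, designed for collecting and preprocessing large-scale device data. Built in Go, it supports multi-protocol device access, real-time data processing, an intelligent rule engine, and multiple aggregate functions, and provides a complete plugin architecture and a modern web management interface. Suitable for industrial intelligence and smart cit…
☆16 · Aug 25, 2025 · Updated 10 months ago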
irespo / irespo
☆17 · Sep 14, 2018 · Updated 7 years ago
LinuxWillWin1 / Assetto-Corsa-on-SteamDeck
Assetto Corsa on the steamdeck is working once again, as you've probably noticed none of the online guides are working, so here is one th…
☆16 · Oct 1, 2023 · Updated 2 years ago
thammegowda / tika-ner-corenlp
Stanford CoreNLP NER addon for Apache Tika's NamedEntityParser
☆13 · Feb 26, 2022 · Updated 4 years ago
openedx / edx-search
☆20 · Updated this week
mitll / vizlinc
Vizlinc
☆15 · Jan 14, 2016 · Updated 10 years ago
joshua-decoder / thrax
Hadoop-based tool for extraction of large scale synchronous grammars for paraphrasing and machine translation
☆15 · Dec 2, 2016 · Updated 9 years ago
AkeemMcLennon / docker-selenium-node-phantomjs
A docker image of PhantomJS 2.0 / GhostDriver that's compatible with selenium grid hub
☆28 · Aug 9, 2016 · Updated 9 years ago
Tathagatd96 / Deep-Autoencoder-using-Tensorflow
☆11 · Jan 16, 2021 · Updated 5 years ago
USCDataScience / polar.usc.edu
Polar USC activities related to NSF Polar CyberInfrastructure program at the University of Southern California
☆15 · Jan 15, 2023 · Updated 3 years ago
codemeta / codemeta-paper
Codemeta paper.
☆10 · Jul 10, 2017 · Updated 9 years ago
pcodding / hadoop_ctakes
Hadoop integration code for working with Apache cTAKES
☆10 · Feb 11, 2014 · Updated 12 years ago
mattflax / dropwizard-tika-server
A DropWizard wrapper around Apache Tika.
☆10 · Dec 22, 2016 · Updated 9 years ago
micwallace / HttpSocketAdaptor
A Simple Http to Raw Socket Adapter for Android
☆12 · Aug 30, 2015 · Updated 10 years ago
chrismattmann / etllib
This is the ETL lib package. It provides an API to munge and prepare JSON, TSV and other data using Apache Tika and JSON parsing/loading …
☆18 · Jan 27, 2024 · Updated 2 years ago
NCEAS / oss-2017
OSS2017 - Open Science for Synthesis: Gulf Research Program
☆10 · May 12, 2019 · Updated 7 years ago
kwkou / 815
☆10 · Jun 16, 2017 · Updated 9 years ago
gsh199449 / DistributedCrawler
Maven version of DistributeCrawler
☆10 · Jun 20, 2022 · Updated 4 years ago
thammegowda / tika-dl4j-spark-imgrec
Image recognition on Spark cluster powered by Deeplearning4j and Apache Tika
☆14 · May 16, 2017 · Updated 9 years ago
vlall / Moses-API
Simple RESTful API server running your own machine translation model. Docker image modified from mbartoli/easy-smt
☆11 · Apr 28, 2019 · Updated 7 years ago
nasa-jpl-memex / elwha
Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess…
☆17 · Sep 11, 2015 · Updated 10 years ago
stbrenner / autoproxy
Autoproxy automatically detects proxies and stores them in the respective environment variables (e.g. http_proxy).
☆14 · Oct 2, 2016 · Updated 9 years ago
informatics-isi-edu / chaise
An adaptive user interface for the Deriva platform.
☆10 · Jul 10, 2026 · Updated last week
gurgeous / simhilarity
Measure text similarity using weighted ngrams.
☆18 · Feb 27, 2014 · Updated 12 years ago
learnsqr / cursoZf2
ZF2 Skeleton App: Api client, Apigility, SocialAuth, etc.
☆10 · Dec 10, 2015 · Updated 10 years ago
Kitware / girder
A data management platform for the web
☆11 · Updated this week
reorx / project_sketch
A nerd's boilerplate for your Python project.
☆18 · Oct 15, 2020 · Updated 5 years ago
TheTechnobear / MusicalHacks
Repo for my musical hacks video series
☆10 · Jun 12, 2020 · Updated 6 years ago
chrismattmann / trec-dd-polar
A dataset downloaded from the deep and scientific web across three major Polar data centers for use in research.
☆13 · Sep 8, 2017 · Updated 8 years ago
USCDataScience / AgePredictor
Age classification from text using PAN16, blogs, Fisher Callhome, and Cancer Forum
☆18 · Jul 1, 2022 · Updated 4 years ago
apache / datasketches-pig
Sketch adaptors for Pig.
☆10 · May 15, 2026 · Updated 2 months ago
ipfs-inactive / window.ipfs-fallback
[DEPRECATED] Use ipfs-provider instead:
☆11 · May 13, 2020 · Updated 6 years ago
mattfullerton / tika-tesseract-docker
Docker container to provide Apache Tika RESTful API
☆41 · Feb 12, 2016 · Updated 10 years ago
nasa-jpl-memex / weapons
MEMEX Weapons Pilot for the illegal weapons domain.
☆15 · May 20, 2016 · Updated 10 years ago