Vagrant project to spin up a cluster of four 32-bit CentOS 6.5 Linux virtual machines with Hadoop v2.6.0 and Spark v1.1.1
☆125Jan 31, 2016Updated 10 years ago
Alternatives and similar repositories for vagrant-hadoop-spark-cluster
Users that are interested in vagrant-hadoop-spark-cluster are comparing it to the libraries listed below.
- Deploying apache-hadoop in a virtualized cluster as easy as 1-2-3.☆127Jan 16, 2017Updated 9 years ago
- ☆26Jan 2, 2024Updated 2 years ago
- Code materials for the MongoDB Spark Course☆13Jun 22, 2017Updated 8 years ago
- An example of bioinformatics and big data tools playing nicely together☆14May 17, 2016Updated 9 years ago
- Cloudera CDH 5.4.0☆21Jul 5, 2015Updated 10 years ago
- Data ingestion examples☆11Feb 12, 2015Updated 11 years ago
- Vagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR☆34May 13, 2016Updated 9 years ago
- Combination of Dockerized Hortonworks projects and other Hadoop ecosystem components☆10Oct 11, 2019Updated 6 years ago
- A little demo of how to bind advanced data science algorithms to 4 different languages☆10Nov 6, 2018Updated 7 years ago
- MaveDB database web application☆13Nov 17, 2023Updated 2 years ago
- Mastering Apache Spark 2x, published by Packt☆17Jan 30, 2023Updated 3 years ago
- A virtual Hadoop cluster running CDH5☆103Jan 3, 2016Updated 10 years ago
- ☆16Jun 22, 2015Updated 10 years ago
- Scripts to set up a Spark cluster (any version) in any OpenStack environment with optional useful tools.☆31Oct 22, 2021Updated 4 years ago
- A spark sbt blueprint to build your own spark apps off of (for cloud native runtime, see the kube/spark examples)☆57Jun 1, 2019Updated 6 years ago
- SQLAlchemy models and DDL and ERD generation from chop-dbhi/data-models style JSON endpoints.☆11May 22, 2023Updated 2 years ago
- Multiple node cluster on Docker for self development.☆92Jul 7, 2018Updated 7 years ago
- Running TPC-H on Apache Hive☆41Jul 15, 2019Updated 6 years ago
- Apache Zeppelin with support for SQL Server☆16Sep 25, 2017Updated 8 years ago
- This is a simple CEP Engine leveraging the Kafka Streams platform☆16Apr 25, 2017Updated 8 years ago
- ☆47May 1, 2017Updated 8 years ago
- Hadoop Examples☆10Jul 1, 2022Updated 3 years ago
- Jupyter notebooks and code for Intro to DL talk at Genesys☆14Aug 14, 2016Updated 9 years ago
- Notes on Lambda Architecture☆12Feb 9, 2018Updated 8 years ago
- Apache Spark and Apache Kafka integration example☆122Dec 21, 2017Updated 8 years ago
- Docker image for Dataiku Science Studio☆10Apr 20, 2017Updated 8 years ago
- Twitter Sentiment Analysis☆10Jul 20, 2015Updated 10 years ago
- Dockerfiles and scripts for Spark and Shark Docker images☆259Jun 19, 2014Updated 11 years ago
- Spotify's Luigi + Amazon's SWF integration☆16Feb 12, 2016Updated 10 years ago
- Simple employee cost/benefit model with plots. Supports a series of blog entries.☆71Nov 2, 2014Updated 11 years ago
- Ansible playbook that installs a Hadoop cluster, with HBase, Hive, Presto for analytics, and Ganglia, Smokeping, Fluentd, Elasticsearch a…☆418Sep 22, 2016Updated 9 years ago
- Information for setting up for the BerkeleyX Spark Intro MOOC, and lab assignments for the course☆346Mar 19, 2021Updated 5 years ago
- Analysis for the batch correction paper☆12Apr 26, 2025Updated 11 months ago
- Terraform script for launching multiple EMR clusters for training purposes.☆16Oct 30, 2025Updated 5 months ago
- Repository for Gephi Plugins maintained by the team. Each plugin has its own branch.☆13Nov 14, 2023Updated 2 years ago
- Ansible recipes for Berkeley Data Analytics Stack deployment☆17Aug 7, 2015Updated 10 years ago
- Storm Database Explorer - Developing Data Products course project.☆11May 3, 2017Updated 8 years ago
- ☆12Jan 19, 2024Updated 2 years ago
- Course materials in Udemy Apache Spark 2.0 + Python: Do Big Data Analytics & ML☆25Mar 14, 2017Updated 9 years ago