Ansible roles to install a Spark Standalone cluster (HDFS/Spark/Jupyter Notebook) or Ambari-based Spark cluster
☆61Jan 30, 2024Updated 2 years ago
Alternatives and similar repositories for ansible-spark-cluster
Users who are interested in ansible-spark-cluster are comparing it to the repositories listed below
- All Certification and preparation, examples & others☆12Oct 18, 2018Updated 7 years ago
- Hadoop Examples☆10Jul 1, 2022Updated 3 years ago
- Ansible Playbook to create LAMP in CentOS 7 with Apache, MySQL, PHP.☆10Dec 28, 2018Updated 7 years ago
- ☆11Dec 14, 2015Updated 10 years ago
- Ansible playbooks for deploying Hortonworks Data Platform and DataFlow using Ambari Blueprints☆249Apr 12, 2021Updated 4 years ago
- Apache Hadoop - Docker distribution based on CentOS 7 and Oracle Java 8☆12Feb 20, 2018Updated 8 years ago
- Projects from my Hadoop training sessions☆16Feb 22, 2018Updated 8 years ago
- Easily import a module and mock its dependencies in an isolated way.☆13May 19, 2022Updated 3 years ago
- Add gevent support to DataStax Python Driver for Apache Cassandra☆11Jun 10, 2020Updated 5 years ago
- ansible playbook to deploy cloudera hadoop components to the cluster☆53Sep 8, 2018Updated 7 years ago
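- DOP is a data management tool built on BlueKing (蓝鲸智云), designed to simplify day-to-day operations of big data components, lower the barrier to entry, and improve operational efficiency. It currently supports Elasticsearch, Kafka, and Hadoop.☆15Feb 28, 2023Updated 3 years ago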
- ☆14Aug 24, 2021Updated 4 years ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.☆20Oct 11, 2021Updated 4 years ago
- Documentation and resources for deploying JupyterHub on Hadoop☆19Jul 16, 2019Updated 6 years ago
- Ansible roles to deploy Kubernetes, JupyterHub, Jupyter Enterprise Gateway and Spark on Kubernetes cluster☆37Dec 4, 2020Updated 5 years ago
- ☆18Apr 24, 2025Updated 10 months ago
- Spark and Python (PySpark) Examples☆39Jul 7, 2021Updated 4 years ago
- A Puppet module for installing, and configuring ScaleIO 2.0x data services components.☆20Feb 19, 2018Updated 8 years ago
- Following along with the Hive tutorial at StrataConf / HadoopWorld☆22Mar 22, 2019Updated 6 years ago
- Ansible playbook for automated HDP 2.x deployment install with Kerberos☆19Sep 8, 2016Updated 9 years ago
- A K8s-based infrastructure for analytics☆24Jan 15, 2020Updated 6 years ago
- Python API for Informatica PowerCenter (pmrep, pmcmd)☆21Sep 17, 2017Updated 8 years ago
- ☆26Dec 28, 2015Updated 10 years ago
- An Ansible collection for Cloudera Platform for on-premise and cloud Datahubs☆38Aug 26, 2025Updated 6 months ago
- Document and showcase how you can create Spark Applications which run inside Docker Containers using Apache Mesos.☆28Feb 25, 2016Updated 10 years ago
- Advanced block device testing/file system testing, targeting SNIA compatible reporting☆12Oct 15, 2025Updated 4 months ago
- AWS LocalStack + Spark Cluster + Zeppelin [Docker]☆10Jul 6, 2022Updated 3 years ago
- 📦 Starting box for Vagrant. Inside box Ubuntu 20.04 LTS with Git, Docker and Docker compose.☆19May 5, 2022Updated 3 years ago
- Dione - a Spark and HDFS indexing library☆52Oct 27, 2025Updated 4 months ago
- Spark with Scala example projects☆34Apr 17, 2019Updated 6 years ago
- This repository provides Dockerfiles to create a ScaleIO cluster on 3 Docker hosts.☆29Dec 17, 2015Updated 10 years ago
- My Study guide used to pass the CRT020 Spark Certification exam☆34Jan 6, 2020Updated 6 years ago
- Language for simplifying parameterized RTL design☆12Nov 6, 2024Updated last year
- Denoising GANs -- TensorFlow2 training code for Gaussian denoiser using the GAN framework.☆10Jan 6, 2022Updated 4 years ago
- NSI power site project☆17May 5, 2012Updated 13 years ago
- Lustre Repository with MS patches☆14Updated this week
- Parses BGP/AS data from multiple different sources☆11Dec 4, 2021Updated 4 years ago
- Python wrappers for the FirecREST API☆12Updated this week
- Apache NiFi metrics exporter for Prometheus☆32Mar 25, 2020Updated 5 years ago