Dockerizing an Apache Spark Standalone Cluster
☆42Jun 29, 2022Updated 3 years ago
Alternatives and similar repositories for apache-spark-docker
Users that are interested in apache-spark-docker are comparing it to the libraries listed below.
- The goal of this project is to identify students at risk of dropping out of school☆22May 7, 2021Updated 4 years ago
- The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on…☆28Jun 13, 2022Updated 3 years ago
- Dockerizing and Consuming an Apache Livy environment☆13Jun 29, 2022Updated 3 years ago
- Scheduling Big Data Workloads and Data Pipelines in the Cloud with pyDag☆23Sep 19, 2022Updated 3 years ago
- Build a Content-Based Movie Recommender System (TF-IDF, BM25, BERT)☆14Jun 13, 2022Updated 3 years ago
- Docker Big Data Tools: This docker-compose file is configured to run multiple nodes. This is a Hadoop Cluster that contains the necessary…☆31Jul 6, 2021Updated 4 years ago
- The goal of this project is to track the expenses of Uber Rides and Uber Eats through data Engineering processes using technologies such …☆123Jun 29, 2022Updated 3 years ago
- CI/CD platform using Jenkins, docker, Sonar, Nexus, Jmeter, Selenium, Ansible, AWX, Grafana, Prometheus, Zabbix, Stress-ng☆21Apr 26, 2026Updated last week
- Simple publisher and subscriber examples for Kombu and Pika with a RabbitMQ broker☆10Mar 23, 2018Updated 8 years ago
- Apache Spark on Apache Yarn 2.6.0 cluster Docker image☆11Oct 18, 2017Updated 8 years ago
- ☆11Jul 13, 2020Updated 5 years ago
- Lecture: Big Data☆14Oct 27, 2025Updated 6 months ago
- MongoDB movie data model, ETL loader, and queries.☆15Nov 6, 2020Updated 5 years ago
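- 拉比克 is an open-source solution for building a big data platform, already running stably on production clusters. It integrates Hadoop, Hive, Hbase, zookeeper, and others, in the manner of CDH☆14Mar 11, 2019Updated 7 years ago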
- Morphometric taxonomy of Central Europe☆39Apr 7, 2026Updated 3 weeks ago
- Zeppelin docker☆16Nov 16, 2020Updated 5 years ago
- This project leverages Hadoop, Spark, SQL, and Hive for efficient data integration, transformation, warehousing, and analytics. It provid…☆23Sep 30, 2023Updated 2 years ago
- Just a boilerplate for PySpark and Flask☆36Aug 2, 2018Updated 7 years ago
- A simple hotel reservation system☆18Jan 20, 2022Updated 4 years ago
- A collection of data analysis projects done using PySpark via Jupyter notebooks.☆10Oct 8, 2022Updated 3 years ago
- Group project for the WorldQuant University module, risk management.☆13Feb 3, 2019Updated 7 years ago
- ☆14Sep 14, 2021Updated 4 years ago
- ☆11May 28, 2025Updated 11 months ago
- ecommerce GCP Streaming pipeline ― Cloud Storage, Compute Engine, Pub/Sub, Dataflow, Apache Beam, BigQuery and Tableau; GCP Batch pipelin…☆11Mar 9, 2022Updated 4 years ago
- This repo contains the material and projects for Udacity Data science Nanodegree term 2☆11Dec 8, 2022Updated 3 years ago
- Tutorial and examples for using Apache Spark☆18Jul 21, 2017Updated 8 years ago
- Docker powered container for using Nginx as reverse-proxy in combination with an OpenVPN Client.☆11Jan 1, 2020Updated 6 years ago
- A data engineering pipeline for digital marketers.☆11Dec 21, 2018Updated 7 years ago
- PredictorFinc is a scalable supervised machine learning model that predicts stock price change through Decision Tree Regressor using data …☆12Sep 5, 2023Updated 2 years ago
- A fast and low memory requirement version of PointHop and PointHop++, which is built upon Apache Spark.☆10Jul 14, 2020Updated 5 years ago
- Hadoop, Hive and PrestoDB for deployment using Docker☆27Oct 21, 2025Updated 6 months ago
- A project for the development of rich geospatial data from the city of São Paulo for use in Machine Learning models.☆11Jul 4, 2021Updated 4 years ago
- Spark Projects for the Berkeley Data Science Course☆13Aug 12, 2015Updated 10 years ago
- While doing this course I worked with multiple technologies, using Python as the base programming language. These are all of the projects I di…☆20Sep 22, 2021Updated 4 years ago
- Docker multi-nodes Hadoop cluster with Spark 2.4.1 on Yarn☆50Dec 7, 2020Updated 5 years ago
- This sample demonstrates how to use the Microsoft Graph JavaScript SDK to access data in Office 365 from Office Add-ins.☆15May 26, 2025Updated 11 months ago
- ☆12May 24, 2022Updated 3 years ago
- MultiPaxos and Disk Paxos in TLA+ and PlusCal☆13Jan 23, 2023Updated 3 years ago
- Node-RED Flow (and web page example) for the LLaMA AI model☆11Jul 27, 2023Updated 2 years ago