Dockerizing an Apache Spark Standalone Cluster
☆42Jun 29, 2022Updated 4 years ago
Alternatives and similar repositories for apache-spark-docker
Users that are interested in apache-spark-docker are comparing it to the libraries listed below.
- The goal of this project is to identify students at risk of dropping out of school☆22May 7, 2021Updated 5 years ago
- The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on…☆29Jun 13, 2022Updated 4 years ago
- The goal of this project is to track the expenses of Uber Rides and Uber Eats through data Engineering processes using technologies such …☆124Jun 29, 2022Updated 4 years ago
- Big Data infrastructure with Hadoop, Spark, Hive and NiFi deployed using Docker Compose. https://doi.org/10.5281/zenodo.18968438☆21Mar 11, 2026Updated 3 months ago
- CI/CD platform using Jenkins, docker, Sonar, Nexus, Jmeter, Selenium, Ansible, AWX, Grafana, Prometheus, Zabbix, Stress-ng☆20May 25, 2026Updated last month
- Apache Spark on Apache Yarn 2.6.0 cluster Docker image☆12Oct 18, 2017Updated 8 years ago
- ☆10Jun 3, 2023Updated 3 years ago
- An end-to-end data engineering pipeline to create a dashboard for the latest content on the r/Stocks subreddit☆20Aug 5, 2022Updated 3 years ago
- ☆13Jun 3, 2022Updated 4 years ago
- My Wardley Mapping Stuff (see: https://medium.com/wardleymaps)☆13May 3, 2021Updated 5 years ago
- ☆21Jul 3, 2019Updated 7 years ago
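- 拉比克 is an open-source solution for building a big data platform, already running stably on production clusters. It integrates Hadoop, Hive, HBase, ZooKeeper, and others, similar to CDH☆14Mar 11, 2019Updated 7 years ago
- zdh series: a Java-based business risk-control engine☆13Jun 7, 2026Updated 3 weeks ago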
- This project leverages Hadoop, Spark, SQL, and Hive for efficient data integration, transformation, warehousing, and analytics. It provid…☆23Sep 30, 2023Updated 2 years ago
- This repo is an approach to TDD in machine learning model operation. It covers project structure, testing essentials using pytest with Gi…☆15Dec 2, 2020Updated 5 years ago
- Just a boilerplate for PySpark and Flask☆36Aug 2, 2018Updated 7 years ago
- Repository for Apache Spark course at Team Data Science☆17Oct 23, 2020Updated 5 years ago
- Local Development of AWS Glue with Docker and Visual Studio Code☆14Nov 29, 2021Updated 4 years ago
- A collection of data analysis projects done using PySpark via Jupyter notebooks.☆10Oct 8, 2022Updated 3 years ago
- ☆14Sep 14, 2021Updated 4 years ago
- Distributed stock price forecasting system to predict S&P 500 stock prices.☆11Nov 12, 2021Updated 4 years ago
- A simple CDK app written in Kotlin using Gradle DSL☆12Dec 28, 2018Updated 7 years ago
- PredictorFinc is a scalable supervised machine learning model that predicts stock price change through Decision Tree Regressor using data …☆12Sep 5, 2023Updated 2 years ago
- Multi-container environment with Hadoop, Spark and Hive☆235May 5, 2025Updated last year
- Starting up a Kubernetes cluster with Vagrant, with Gluster, Portworx, Linstor, or StorageOS as storage provider and Traefik as ingress c…☆11May 25, 2022Updated 4 years ago
- Hadoop, Hive and PrestoDB for deployment using Docker☆27Oct 21, 2025Updated 8 months ago
- A project for the development of rich geospatial data from the city of São Paulo for use in Machine Learning models.☆11Jul 4, 2021Updated 4 years ago
- Spark Projects for the Berkeley Data Science Course☆13Aug 12, 2015Updated 10 years ago
- Docker multi-node Hadoop cluster with Spark 2.4.1 on Yarn☆50Dec 7, 2020Updated 5 years ago
- ☆12May 24, 2022Updated 4 years ago
- ☆14Jul 12, 2020Updated 5 years ago
- Analyzing Big Data with Amazon EMR☆12Sep 14, 2020Updated 5 years ago
- Ask questions about your PDF☆10Jun 11, 2023Updated 3 years ago
- Manage RabbitMQ with Ansible☆34Jun 22, 2026Updated last week
- This is a pipeline of an ETL application in GCP with open airport code data, which you can find here: https://datahub.io/core/airport-cod…☆15Nov 15, 2021Updated 4 years ago
- GPT3 Chrome Extension Starter Kit☆16Jan 16, 2023Updated 3 years ago
- A template to create CVs/Resumes with Quarto☆12Jul 17, 2023Updated 2 years ago
- noiseprint2 is a port of noiseprint to tensorflow 2 and keras☆12Feb 20, 2021Updated 5 years ago
- Methods for mapping proteomics data on 3D protein structure.☆15Jan 18, 2020Updated 6 years ago