lmassaoy / spark-on-k8s
Presents three ways to run Spark in containers. Recommended for those who want to explore Big Data outside a Hadoop cluster.
☆10Updated 4 years ago
Alternatives and similar repositories for spark-on-k8s:
Users interested in spark-on-k8s are comparing it to the libraries listed below:
- ☆15Updated 10 months ago
- ☆23Updated last year
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta Lake, Postgres + Debezium CDC, MySQL,…☆28Updated last month
- This repo provides the Kubernetes Helm chart for deploying Pyspark Notebook.☆17Updated 2 years ago
- ☆22Updated 3 years ago
- Deploy of Airflow 2.0 using ECS Fargate and AWS CDK.☆14Updated 3 years ago
- Data Engineering with Apache Spark☆43Updated 3 years ago
- This is an ETL application on AWS with general open sales and customer data that you can find here: https://github.com/camposvinicius/dat…☆17Updated 3 years ago
- ☆23Updated last year
- ☆61Updated 11 months ago
- Standalone Apache Spark installer for Linux systems (Debian- and RHEL-based)☆13Updated 2 months ago
- Spark development environment for kubernetes, spark-submit and jupyter notebook☆19Updated 3 years ago
- A repository of sample code to show data quality checking best practices using Airflow.☆74Updated last year
- 🐋 Docker image for AWS Glue Spark/Python☆23Updated last year
- Delta-Lake, ETL, Spark, Airflow☆46Updated 2 years ago
- Sample Airflow DAGs☆62Updated 2 years ago
- Football scouts from Cartola FC at a data lake with data warehouse and dashboard☆17Updated 2 years ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes☆63Updated 2 years ago
- ☆43Updated 2 years ago
- Docker Apache Airflow☆31Updated 3 years ago
- Base Docker image with just essentials: Hadoop, Hive and Spark.☆68Updated 4 years ago
- A data engineering personal project for applying some of my skills☆19Updated 3 years ago
- ☆40Updated 4 years ago
- Big Data Ecosystem Docker☆402Updated last year
- ☆10Updated this week
- ☆20Updated 3 years ago
- ☆15Updated last month
- A client for connecting and running DDLs on hive metastore.☆55Updated 11 months ago
- ☆37Updated 6 months ago
- An exercise running Kafka, Kafka Connect, PostgreSQL, Superset and AWS S3☆21Updated 3 years ago