lmassaoy / spark-on-k8s
This project presents three ways to run Spark in containers and is recommended for those who want to explore Big Data outside a Hadoop cluster.
☆10Updated 4 years ago
Alternatives and similar repositories for spark-on-k8s
Users who are interested in spark-on-k8s are comparing it to the repositories listed below.
- This repo provides the Kubernetes Helm chart for deploying Pyspark Notebook.☆17Updated 2 years ago
- Grafana dashboards and StatsD exporter config for Airflow monitoring☆289Updated last year
- ☆23Updated 2 years ago
- ☆60Updated last year
- Airflow Deployment on AWS ECS Fargate Using Cloudformation☆205Updated 3 years ago
- ☆21Updated 3 years ago
- Apache Flink (Pyflink) and Related Projects☆41Updated 6 months ago
- Sample Airflow DAGs☆63Updated 2 years ago
- Airflow Unit Tests and Integration Tests☆261Updated 2 years ago
- Airflow plugin to export dag and task based metrics to Prometheus.☆263Updated 2 weeks ago
- A repository of sample code to show data quality checking best practices using Airflow.☆78Updated 2 years ago
- Resources for video demonstrations and blog posts related to DataOps on AWS☆181Updated 3 years ago
- Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker.☆495Updated 2 years ago
- Airflow Examples: code samples for Medium articles☆14Updated 4 years ago
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta Lake, Postgres + Debezium CDC, MySQL,…☆28Updated 5 months ago
- Airflow support for Marquez☆31Updated 4 years ago
- Spark development environment for kubernetes, spark-submit and jupyter notebook☆19Updated 3 years ago
- Apache Airflow in Docker Compose (for both versions 1.10.* and 2.*)☆186Updated last year
- A Helm chart to install Apache Airflow on Kubernetes☆290Updated this week
- Delta-Lake, ETL, Spark, Airflow☆48Updated 3 years ago
- This is an ETL application on AWS with general open sales and customer data that you can find here: https://github.com/camposvinicius/dat…☆18Updated 3 years ago
- A simple Spark standalone cluster for your testing environment purposes☆570Updated last year
- Materials for the next course☆25Updated 2 years ago
- A plugin for Apache Airflow that exposes REST endpoints for the command-line interface☆326Updated 4 years ago
- Data Engineering with Apache Spark☆42Updated 4 years ago
- ☆13Updated 8 months ago
- ☆40Updated 4 years ago
- ☆17Updated last year
- A custom sink provider for Apache Spark that sends the content of a DataFrame to an AWS SQS queue☆23Updated last year
- A repository of sample code to accompany our blog post on Airflow and dbt.☆179Updated 2 years ago