bloomberg / apache-spark-on-k8s
Apache Spark enhanced with native Kubernetes scheduler back-end
☆15 Updated last year
Alternatives and similar repositories for apache-spark-on-k8s:
Users who are interested in apache-spark-on-k8s are comparing it to the repositories listed below:
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- Bytewax Helm charts repository ☆12 Updated 11 months ago
- This repository has a collection of utilities for Glue Crawlers. These utilities come in the form of AWS CloudFormation templates or AWS … ☆19 Updated 3 years ago
- Export Airflow metrics (from mysql) in prometheus format ☆29 Updated last week
- DataHub on AWS demonstration resources ☆10 Updated 2 years ago
- Example script to deploy DAGs to Google Cloud Composer. ☆15 Updated 2 years ago
- Ansible playbooks for Apache Spark on kube ☆27 Updated 7 years ago
- 🐋 Docker image for AWS Glue Spark/Python ☆23 Updated last year
- Ansible roles to deploy Kubernetes, JupyterHub, Jupyter Enterprise Gateway and Spark on Kubernetes cluster ☆39 Updated 4 years ago
- ☆11 Updated 5 years ago
- Utility functions for dbt projects running on Spark ☆32 Updated 2 months ago
- A facebook for data ☆26 Updated 5 years ago
- event-triggered plugins for airflow ☆21 Updated 5 years ago
- This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/) which creates a concu… ☆75 Updated 6 years ago
- Contains example dags and terraform code to create a composer with a node pool to run pods ☆13 Updated 4 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆29 Updated this week
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 Updated 8 years ago
- Udacity Data Pipeline Exercises ☆15 Updated 4 years ago
- ☆28 Updated 7 months ago
- ☆24 Updated 5 years ago
- Recipes of publicly-available Jupyter images ☆8 Updated last month
- AWS bootstrap scripts for Mozilla's flavoured Spark setup. ☆47 Updated 5 years ago
- Scala SDK for working with Snowplow enriched events in Spark, AWS Lambda, Flink et al. ☆21 Updated 5 months ago
- ☆10 Updated 2 years ago
- Cloudbox Labs blog code ☆35 Updated 6 years ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 Updated last year
- Ibis analytics, with Ibis (and more!) ☆21 Updated 7 months ago
- Spark ETL example processing New York taxi rides public dataset on EKS ☆44 Updated 2 years ago
- Amazon EMR on EKS Custom Image CLI ☆31 Updated 7 months ago
- Utilities to help HBase as a service in HDInsight Azure ☆14 Updated last year