bloomberg / apache-spark-on-k8s
Apache Spark enhanced with native Kubernetes scheduler back-end
☆15 Updated last year
Alternatives and similar repositories for apache-spark-on-k8s
Users who are interested in apache-spark-on-k8s are comparing it to the repositories listed below.
- Ansible playbooks for Apache Spark on kube ☆27 Updated 7 years ago
- Scala SDK for working with Snowplow enriched events in Spark, AWS Lambda, Flink et al. ☆21 Updated 7 months ago
- ☆11 Updated 5 years ago
- Export Airflow metrics (from mysql) in prometheus format ☆29 Updated 2 months ago
- Ansible roles to deploy Kubernetes, JupyterHub, Jupyter Enterprise Gateway and Spark on Kubernetes cluster ☆38 Updated 4 years ago
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- Dremio Flight connector. Access Dremio using Arrow flight ☆40 Updated 4 years ago
- Lightweight configuration and access to multiple databases in a single project ☆38 Updated last year
- DataHub on AWS demonstration resources ☆10 Updated 2 years ago
- A facebook for data ☆26 Updated 6 years ago
- This repository contains recipes for Apache Pinot. ☆30 Updated 3 months ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆29 Updated this week
- Bytewax Helm charts repository ☆12 Updated last year
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. ☆24 Updated 6 years ago
- Spark Scala docker container sample for AWS testing - EKS & S3 ☆24 Updated 6 years ago
- 💻 CLI for reporting events to Faros platform ☆14 Updated last month
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- Fybrik platform - Arrow/Flight module ☆16 Updated 10 months ago
- A toolset to streamline running spark python on EMR ☆20 Updated 8 years ago
- Herd-UI is a search and discovery tool for business and technical users. Everyone in your organization can use Herd-UI to browse and unde… ☆16 Updated 2 years ago
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes ☆53 Updated 5 years ago
- Testing Scala code with scalatest ☆12 Updated 2 years ago
- A docker image with a pre-configured Hive Metastore and a Spark ThriftServer ☆19 Updated 5 years ago
- pysh-db - The Data Science Toolkit (DSK) ☆13 Updated 6 years ago
- Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub ☆37 Updated 7 years ago
- Kafka Connect playground ☆10 Updated 5 years ago
- Docker Image and Kubernetes Configurations for Spark 2.x ☆41 Updated 5 years ago
- Utility functions for dbt projects running on Spark ☆34 Updated 4 months ago
- An example PySpark project with pytest ☆16 Updated 7 years ago
- Spark ETL example processing New York taxi rides public dataset on EKS ☆45 Updated 2 years ago