hussein-awala / spark-on-k8s
A Python package to submit and manage Apache Spark applications on Kubernetes.
☆44Updated 3 months ago
Alternatives and similar repositories for spark-on-k8s
Users who are interested in spark-on-k8s are comparing it to the libraries listed below.
- A Python Library to support running data quality rules while the spark job is running⚡☆193Updated this week
- Drop-in replacement for Apache Spark UI☆354Updated last week
- ☆269Updated last year
- Adapter for dbt that executes dbt pipelines on Apache Flink☆96Updated last year
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow☆222Updated last week
- Delta Lake examples☆233Updated last year
- A Microsoft Power BI Custom Connector allowing you to import Trino data into Power BI.☆77Updated 10 months ago
- A Python package that creates fine-grained dbt tasks on Apache Airflow☆77Updated this week
- Helm charts for Trino and Trino Gateway☆187Updated 2 weeks ago
- Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an A…☆130Updated 3 weeks ago
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects☆223Updated 6 months ago
- Data Product Portal created by Dataminded☆195Updated last week
- Delta Lake helper methods in PySpark☆324Updated last year
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com)☆253Updated 2 months ago
- Repository of helm charts for deploying DataHub on a Kubernetes cluster☆196Updated this week
- Delta Lake Documentation☆51Updated last year
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever…☆274Updated last month
- Quick Guides from Dremio on Several topics☆79Updated last week
- This repository has moved into https://github.com/dbt-labs/dbt-adapters☆443Updated 4 months ago
- Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.☆375Updated 6 months ago
- Repo for everything open table formats (Iceberg, Hudi, Delta Lake) and the overall Lakehouse architecture☆124Updated 2 weeks ago
- A library that provides useful extensions to Apache Spark and PySpark.☆230Updated 3 weeks ago
- Open Control Plane for Tables in Data Lakehouse☆371Updated last week
- REST API for Apache Spark on K8S or YARN☆107Updated 2 weeks ago
- The Open-Source Enterprise Data Platform in a single Portal☆261Updated this week
- Airflow Providers containing Deferrable Operators & Sensors from Astronomer☆149Updated last week
- Python client for Trino☆405Updated 2 months ago
- Grafana dashboards and StatsD exporter config for Airflow monitoring☆288Updated last year
- Delta Lake helper methods. No Spark dependency.☆23Updated last year
- CLI tool to bulk migrate the tables from one catalog to another without a data copy☆83Updated 7 months ago