PApostol / spark-submit
Python manager for spark-submit jobs
☆10 Updated last year
Alternatives and similar repositories for spark-submit
Users who are interested in spark-submit compare it to the libraries listed below.
- ✨ A Pydantic to PySpark schema library ☆96 Updated this week
- A Python Library to support running data quality rules while the spark job is running⚡ ☆188 Updated this week
- Turning PySpark Into a Universal DataFrame API ☆407 Updated this week
- ☆74 Updated 4 months ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆217 Updated last week
- prefect integration for running dbt ☆62 Updated 9 months ago
- A provider package for kafka ☆37 Updated last year
- Great Expectations Airflow operator ☆166 Updated this week
- dbt + Trino demo project, using TPC-H sample data ☆19 Updated last year
- Quick Guides from Dremio on Several topics ☆71 Updated this week
- ☆132 Updated last month
- Read Apache Arrow batches from ODBC data sources in Python ☆65 Updated this week
- A Python package that creates fine-grained dbt tasks on Apache Airflow ☆70 Updated 9 months ago
- Pythonic Iceberg REST Catalog ☆2 Updated last week
- Delta Lake helper methods. No Spark dependency. ☆23 Updated 9 months ago
- Adapter for dbt that executes dbt pipelines on Apache Flink ☆95 Updated last year
- Run, mock and test fake Snowflake databases locally. ☆143 Updated last week
- Read Delta tables without any Spark ☆47 Updated last year
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆195 Updated last week
- Delta Lake examples ☆225 Updated 8 months ago
- Delta Lake helper methods in PySpark ☆323 Updated 9 months ago
- A lightweight Python-based tool for extracting and analyzing data column lineage for dbt projects ☆168 Updated 3 months ago
- Airflow Providers containing Deferrable Operators & Sensors from Astronomer ☆148 Updated this week
- A dbt artifacts parser in python ☆93 Updated last week
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com) ☆238 Updated 3 weeks ago
- ☆26 Updated last year
- A dbt-core plugin to weave together multi-project dbt-core deployments ☆155 Updated last week
- ☆43 Updated 3 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 Updated last year
- PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows ☆43 Updated last week