tcmlabs / hexagonal-architecture-python-spark

Hexagonal (ports and adapters) architecture applied to a Spark and Python data engineering project

☆33

Alternatives and similar repositories for hexagonal-architecture-python-spark

Users who are interested in hexagonal-architecture-python-spark compare it to the repositories listed below.

kaxil / airflowctl
A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects
☆223 · Updated 7 months ago
Nike-Inc / brickflow
Pythonic Programming Framework to orchestrate jobs in Databricks Workflow
☆222 · Updated last week
bartosz25 / data-engineering-design-patterns-book
Code snippets for Data Engineering Design Patterns book
☆275 · Updated 8 months ago
datacontract / datacontract-specification
The Data Contract Specification Repository
☆391 · Updated 2 months ago
danielbeach / tinytimmy
A simple and easy to use Data Quality (DQ) tool built with Python.
☆50 · Updated 2 years ago
canimus / cuallee
Possibly the fastest DataFrame-agnostic quality check library in town.
☆227 · Updated last month
AltimateAI / awesome-data-contracts
A curated list of awesome blogs, videos, tools and resources about Data Contracts
☆180 · Updated last year
gmyrianthous / dbt-airflow
A Python package that creates fine-grained dbt tasks on Apache Airflow
☆77 · Updated this week
anna-geller / prefect-dataplatform
Example repository showing how to build a data platform with Prefect, dbt and Snowflake
☆108 · Updated 2 years ago
paypal / data-contract-template
Template for a data contract used in a data mesh.
☆484 · Updated last year
Nike-Inc / spark-expectations
A Python Library to support running data quality rules while the spark job is running⚡
☆193 · Updated this week
pyjaime / docker-airflow-spark
Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks
☆24 · Updated 3 years ago
MrPowers / chispa
PySpark test helper methods with beautiful error messages
☆730 · Updated 2 months ago
datacontract / datacontract-cli
Enforce Data Contracts
☆741 · Updated this week
dagster-io / awesome-dagster
All things awesome related to Dagster!
☆134 · Updated last month
MrPowers / mack
Delta Lake helper methods in PySpark
☆324 · Updated last year
anna-geller / dataflow-ops
Project demonstrating how to automate Prefect 2.0 deployments to AWS ECS Fargate
☆116 · Updated 2 years ago
mitchelllisle / sparkdantic
✨ A Pydantic to PySpark schema library
☆112 · Updated this week
spbail / dag-stack
Data pipeline with dbt, Airflow, Great Expectations
☆165 · Updated 4 years ago
EcZachly / microbatch-hourly-deduped-tutorial
☆120 · Updated 4 months ago
sodadata / soda-spark
Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes
☆64 · Updated 3 years ago
josephmachado / simple_dbt_project
Code for dbt tutorial
☆165 · Updated 2 months ago
mehd-io / pypi-duck-flow
end-to-end data engineering project to get insights from PyPi using python, duckdb, MotherDuck & Evidence
☆227 · Updated last month
marcosmarxm / airflow-testing-ci-workflow
(project & tutorial) dag pipeline tests + ci/cd setup
☆89 · Updated 4 years ago
bruno-szdl / dbt-ci-cd
☆164 · Updated 3 months ago
adidas / lakehouse-engine
The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever…
☆275 · Updated last month
hankehly / deploy-airflow-on-ecs-fargate
An example of how to deploy Apache Airflow on Amazon ECS Fargate
☆45 · Updated 3 years ago
kanton-bern / hellodata-be
The Open-Source Enterprise Data Platform in a single Portal
☆261 · Updated this week
TJaniF / airflow-elt-blueprint
A self-contained, ready to run Airflow ELT project. Can be run locally or within codespaces.
☆79 · Updated 2 years ago
Armaan1Gohil / dataengineering-tech-stack
Local Environment to Practice Data Engineering
☆143 · Updated 11 months ago