tcmlabs / hexagonal-architecture-python-spark
Hexagonal (ports and adapters) architecture applied to a Spark and Python data engineering project
☆33Updated 2 years ago
Alternatives and similar repositories for hexagonal-architecture-python-spark
Users who are interested in hexagonal-architecture-python-spark are comparing it to the libraries listed below
- Possibly the fastest DataFrame-agnostic quality check library in town.☆233Updated last month
- A simple and easy to use Data Quality (DQ) tool built with Python.☆50Updated 2 years ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow☆222Updated 3 weeks ago
- Modern Data Engineering Project☆12Updated 3 years ago
- Code snippets for Data Engineering Design Patterns book☆296Updated last week
- A Series of Notebooks on how to start with Kafka and Python☆152Updated 9 months ago
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks☆24Updated 3 years ago
- A curated list of awesome blogs, videos, tools and resources about Data Contracts☆180Updated last year
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects☆225Updated 7 months ago
- Project demonstrating how to automate Prefect 2.0 deployments to AWS ECS Fargate☆116Updated 2 years ago
- A Python package that creates fine-grained dbt tasks on Apache Airflow☆80Updated this week
- Food for thought around data contracts☆29Updated 5 months ago
- A self-contained, ready to run Airflow ELT project. Can be run locally or within codespaces.☆79Updated 2 years ago
- Delta Lake helper methods in PySpark☆325Updated last year
- PySpark test helper methods with beautiful error messages☆740Updated last week
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever…☆278Updated 2 months ago
- ☆120Updated 5 months ago
- The Data Contract Specification Repository☆401Updated 2 weeks ago
- Example repo to create end to end tests for data pipeline.☆25Updated last year
- A Python PySpark Project with Poetry☆24Updated 5 months ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes☆63Updated 3 years ago
- A Python Library to support running data quality rules while the spark job is running⚡☆193Updated this week
- Example repository showing how to build a data platform with Prefect, dbt and Snowflake☆108Updated 2 years ago
- 🏃‍♀️ Minimalist SQL orchestrator☆295Updated last week
- Delta-Lake, ETL, Spark, Airflow☆48Updated 3 years ago
- ☆42Updated 4 years ago
- ☆169Updated 4 months ago
- Template for a data contract used in a data mesh.☆486Updated last year
- PySpark schema generator☆43Updated 2 years ago
- Enforce Data Contracts☆778Updated last week