tcmlabs / hexagonal-architecture-python-spark
Hexagonal (ports and adapters) architecture applied to a Spark and Python data engineering project
☆32 · Updated last year
Alternatives and similar repositories for hexagonal-architecture-python-spark:
Users who are interested in hexagonal-architecture-python-spark are comparing it to the repositories listed below:
- Modern Data Engineering Project ☆11 · Updated 2 years ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆192 · Updated last month
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆49 · Updated last year
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects ☆204 · Updated this week
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆180 · Updated this week
- ☆107 · Updated 5 months ago
- The Open-Source Enterprise Data Platform in a single Portal ☆227 · Updated this week
- ☆43 · Updated 3 years ago
- Swiple enables you to easily observe, understand, validate and improve the quality of your data ☆82 · Updated this week
- Example Repo to have full end to end pyspark testing via docker-compose ☆30 · Updated last year
- Food for thought around data contracts ☆24 · Updated last week
- Edit your data contract in the Data Contract Editor ☆15 · Updated 3 months ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ☆231 · Updated 2 months ago
- The Data Contract Specification Repository ☆299 · Updated this week
- Fake Snowflake Connector for Python. Run, mock and test Snowflake DB locally. ☆112 · Updated 2 weeks ago
- A Python package that creates fine-grained dbt tasks on Apache Airflow ☆62 · Updated 3 months ago
- csv and flat-file sniffer built in Rust. ☆42 · Updated 11 months ago
- A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation an… ☆23 · Updated last year
- Code snippets for Data Engineering Design Patterns book ☆49 · Updated last week
- Playing with different packages of Apache Spark ☆27 · Updated 7 months ago
- A self-contained, ready to run Airflow ELT project. Can be run locally or within codespaces. ☆62 · Updated last year
- ☆72 · Updated 3 months ago
- A guide for leading a data (engineering) team ☆62 · Updated 8 months ago
- ✨ A Pydantic to PySpark schema library ☆63 · Updated this week
- Example repository showing how to build a data platform with Prefect, dbt and Snowflake ☆97 · Updated last year
- Delta Lake helper methods in PySpark ☆312 · Updated 4 months ago
- Black for Databricks notebooks ☆44 · Updated last week
- A portable Datamart and Business Intelligence suite built with Docker, sqlmesh + dbtcore, DuckDB and Superset ☆46 · Updated 2 months ago