adidas / lakehouse-engine
The Lakehouse Engine is a configuration-driven Spark framework, written in Python, that serves as a scalable, distributed engine for several lakehouse algorithms, data flows, and utilities for Data Products.
☆223 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for lakehouse-engine
- Delta Lake helper methods in PySpark ☆304 Updated 2 months ago
- A Python Library to support running data quality rules while the spark job is running⚡ ☆163 Updated last week
- Delta Lake examples ☆207 Updated last month
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆189 Updated last week
- Data product portal created by Dataminded ☆146 Updated this week
- Demo of using the Nutter for testing of Databricks notebooks in the CI/CD pipeline ☆151 Updated 3 months ago
- Code samples, etc. for Databricks ☆60 Updated last month
- Databricks Implementation of the TPC-DI Specification using Traditional Notebooks and/or Delta Live Tables ☆75 Updated this week
- This repo is a collection of tools to deploy, manage and operate a Databricks based Lakehouse. ☆41 Updated 3 weeks ago
- DBSQL SME Repo contains demos, tutorials, blog code, advanced production helper functions and more! ☆37 Updated this week
- PySpark test helper methods with beautiful error messages ☆621 Updated 3 weeks ago