adidas / lakehouse-engine
The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.
☆279 · Oct 7, 2025 · Updated 4 months ago
Alternatives and similar repositories for lakehouse-engine
Users that are interested in lakehouse-engine are comparing it to the libraries listed below
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆226 · Feb 2, 2026 · Updated last week
- A cloud data platform product to accelerate time to insights. Our open-source framework is designed for the real world. Stripping away th… ☆23 · Jan 30, 2026 · Updated 2 weeks ago
- A Python Library to support running data quality rules while the spark job is running⚡ ☆197 · Updated this week
- Delta Lake helper methods in PySpark ☆327 · Jan 19, 2026 · Updated 3 weeks ago
- ☆18 · Aug 6, 2024 · Updated last year
- A DataOps framework for building a lakehouse. ☆56 · Feb 5, 2026 · Updated last week
- PySpark test helper methods with beautiful error messages ☆752 · Jan 13, 2026 · Updated last month
- ☆18 · May 26, 2025 · Updated 8 months ago
- A portable Datamart and Business Intelligence suite built with Docker, sqlmesh + dbtcore, DuckDB and Superset ☆55 · Oct 13, 2025 · Updated 4 months ago
- This repo is a collection of tools to deploy, manage and operate a Databricks based Lakehouse. ☆46 · Jan 27, 2025 · Updated last year
- Open, Multi-modal Catalog for Data & AI ☆3,304 · Feb 6, 2026 · Updated last week
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta lake, Postgres+ Debezium CDC, MySQL,… ☆28 · May 19, 2025 · Updated 8 months ago
- 🏃♀️ Minimalist SQL orchestrator ☆302 · Updated this week
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆682 · Mar 6, 2025 · Updated 11 months ago
- Databricks Platform - Architecture, Security, Automation and much more!! ☆52 · Updated this week
- Cost Efficient Data Pipelines with DuckDB ☆61 · May 14, 2025 · Updated 9 months ago
- Open Control Plane for Tables in Data Lakehouse ☆380 · Updated this week
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆236 · Feb 5, 2026 · Updated last week
- Upload of all my presentations which I've been doing in the past ☆10 · Feb 5, 2026 · Updated last week
- Data Contracts engine for the modern data stack. https://www.soda.io ☆2,288 · Updated this week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,413 · Updated this week
- Code snippets used in demos recorded for the blog. ☆37 · Jan 17, 2026 · Updated 3 weeks ago
- ☆49 · Oct 15, 2024 · Updated last year
- Monitor data sources and track changes over time 🐿️ ☆11 · Nov 7, 2024 · Updated last year
- OCaml and Rust-style exhaustive exception handling for Python. ☆33 · Jan 2, 2026 · Updated last month
- A benchmark tool for lakehouses. ☆14 · Mar 12, 2023 · Updated 2 years ago
- A cross tenant metadata driven processing framework for Azure Data Factory and Azure Synapse Analytics achieved by coupling orchestration… ☆187 · Feb 13, 2024 · Updated 2 years ago
- ☆23 · May 16, 2023 · Updated 2 years ago
- Testing framework for Databricks notebooks ☆315 · Apr 20, 2024 · Updated last year
- Delta lake and filesystem helper methods ☆50 · Feb 29, 2024 · Updated last year
- In this article, you will learn how to set up a real-time data processing and analytics environment using Docker, MySQL, Redpanda, MinIO,… ☆11 · Jun 27, 2023 · Updated 2 years ago
- adidas Data Mesh implementation ☆12 · May 13, 2022 · Updated 3 years ago
- ☆12 · Mar 7, 2025 · Updated 11 months ago
- Kafka Connect: How to create a real time data pipeline using Change Data Capture (CDC) ☆13 · Jan 24, 2021 · Updated 5 years ago
- Data pipeline project using Data Factory, Databricks and Cosmosdb Graph, deployed using Azure DevOps, secured using firewalls and Azure A… ☆11 · Dec 14, 2022 · Updated 3 years ago
- Python framework for building efficient data pipelines. It promotes modularity and collaboration, enabling the creation of complex pipeli… ☆651 · Feb 1, 2026 · Updated last week
- Fake Pandas / PySpark DataFrame creator ☆48 · Mar 10, 2024 · Updated last year
- Scalable and efficient data transformation framework - backwards compatible with dbt. ☆2,891 · Updated this week
- ☆14 · Nov 10, 2023 · Updated 2 years ago