adidas / m3d-engine
M3D Engine is a Spark application for the development of scalable data transformations and ingestions in data lakes.
☆19 Updated 4 years ago
Alternatives and similar repositories for m3d-engine
Users who are interested in m3d-engine compare it to the libraries listed below.
- Metadata Driven Development (m3d) is a cloud and platform agnostic framework for the automated creation, management and governance of dat… ☆33 Updated 2 years ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆97 Updated last week
- ☆27 Updated last year
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep… ☆20 Updated 5 years ago
- Curated list of resources about Apache Airflow ☆19 Updated 4 years ago
- Dremio Container Tools ☆163 Updated 3 months ago
- Databricks Migration Tools ☆43 Updated 4 years ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 3 years ago
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆159 Updated 3 years ago
- ☆98 Updated 2 years ago
- Performance optimization for Spark running on Kubernetes ☆89 Updated 5 years ago
- Setup for running Trino with Hive Metastore on Kubernetes ☆103 Updated 3 years ago
- Yet Another (Spark) ETL Framework ☆21 Updated 2 years ago
- Prometheus Exporter for Airflow ☆161 Updated last year
- Auto-generated Diagrams from Airflow DAGs. 🔮 🪄 ☆354 Updated this week
- Airflow support for Marquez ☆31 Updated 5 years ago
- Smart Automation Tool for building modern Data Lakes and Data Pipelines ☆123 Updated last week
- Sample Airflow DAGs ☆64 Updated 3 years ago
- A simple Spark-powered ETL framework that just works 🍺 ☆181 Updated 2 months ago
- Multi-stage, config driven, SQL based ETL framework using PySpark ☆26 Updated 6 years ago
- DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes. ☆267 Updated 8 months ago
- Spark on Kubernetes infrastructure Helm charts repo ☆202 Updated 3 years ago
- Spark ETL example processing New York taxi rides public dataset on EKS ☆44 Updated 2 years ago
- Accompanying code examples for webinar and blog post "three ways to run airflow on kubernetes" ☆15 Updated 5 years ago
- Delta Lake Documentation ☆51 Updated last year
- The Internals of Spark on Kubernetes ☆72 Updated 3 years ago
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆102 Updated 2 years ago
- This repo is a collection of tools to deploy, manage and operate a Databricks based Lakehouse. ☆46 Updated 10 months ago
- spark on kubernetes ☆104 Updated 2 years ago
- Rules based grant management for Snowflake ☆41 Updated 6 years ago