adidas / m3d-engine
M3D Engine is a Spark application for the development of scalable data transformations and ingestions in data lakes.
☆18 · Updated 3 years ago
Alternatives and similar repositories for m3d-engine:
Users who are interested in m3d-engine are comparing it to the libraries listed below.
- Metadata Driven Development (m3d) is a cloud and platform agnostic framework for the automated creation, management and governance of dat… ☆31 · Updated last year
- Spark app to merge different schemas ☆23 · Updated 4 years ago
- Yet Another (Spark) ETL Framework ☆20 · Updated last year
- Unity Catalog UI ☆39 · Updated 5 months ago
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep… ☆19 · Updated 4 years ago
- Event-triggered plugins for Airflow ☆21 · Updated 5 years ago
- AWS Quick Start Team ☆18 · Updated 5 months ago
- CI/CD pipeline that deploys a dbt image on a GKE cluster ☆11 · Updated 3 years ago
- Hadoop/Hive/Spark container to perform CI tests ☆11 · Updated 4 years ago
- A table-format-agnostic data sharing framework ☆38 · Updated last year
- Sample code to collect Apache Iceberg metrics for table monitoring ☆24 · Updated 6 months ago
- Data validation library for PySpark 3.0.0 ☆34 · Updated 2 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 · Updated 8 years ago
- This repo is a collection of tools to deploy, manage and operate a Databricks based Lakehouse. ☆44 · Updated last month
- Spark ETL example processing New York taxi rides public dataset on EKS ☆44 · Updated 2 years ago
- Delta Lake Documentation ☆48 · Updated 8 months ago
- A K8s-based infrastructure for analytics ☆24 · Updated 5 years ago
- Rules based grant management for Snowflake ☆40 · Updated 6 years ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆74 · Updated this week
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. ☆24 · Updated 6 years ago
- Example script to deploy DAGs to Google Cloud Composer. ☆15 · Updated 2 years ago
- The Internals of Spark on Kubernetes ☆70 · Updated 2 years ago
- CI/CD for Snowflake using Jenkins and Sqitch ☆8 · Updated 5 years ago
- Python code that collapses structured columns, separating out the attributes into new columns ☆11 · Updated 2 years ago
- Building a JSON data pipeline within Snowflake using Streams and Tasks ☆26 · Updated 5 years ago
- dbt package for monitoring Airflow DAGs and tasks ☆29 · Updated 2 weeks ago
- Collection of utility scripts to extract code so it can be upgraded to Snowflake using the SnowConvert tool. ☆12 · Updated last week
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 · Updated last year