adidas / m3d-engine
M3D Engine is a Spark application for the development of scalable data transformations and ingestions in data lakes.
☆18 Updated 4 years ago
Alternatives and similar repositories for m3d-engine
Users who are interested in m3d-engine are comparing it to the libraries listed below.
- Metadata Driven Development (m3d) is a cloud and platform agnostic framework for the automated creation, management and governance of dat… ☆31 Updated 2 years ago
- ☆11 Updated 5 years ago
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- Using the Parquet file format with Python ☆15 Updated last year
- DataHub on AWS demonstration resources ☆10 Updated 2 years ago
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep… ☆20 Updated 5 years ago
- Kafka Connect playground ☆10 Updated 5 years ago
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. ☆24 Updated 6 years ago
- Unity Catalog UI ☆41 Updated 10 months ago
- A Data Mesh demo repository ☆13 Updated 9 months ago
- AWS Quick Start Team ☆19 Updated 9 months ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 Updated 8 years ago
- Python code that will collapse structured columns separating out the attributes into new columns ☆11 Updated 3 years ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆75 Updated this week
- 💻 CLI for reporting events to Faros platform ☆14 Updated 2 months ago
- dbt package for monitoring airflow DAGs and tasks ☆29 Updated 5 months ago
- Ansible roles to deploy Kubernetes, JupyterHub, Jupyter Enterprise Gateway and Spark on Kubernetes cluster ☆38 Updated 4 years ago
- Dremio Flight connector. Access Dremio using Arrow flight ☆40 Updated 4 years ago
- Apiary provides modules which can be combined to create a federated cloud data lake ☆36 Updated last year
- Example project using DBT, Databricks and AdventureWorks sample database ☆12 Updated 2 years ago
- Repository containing various utils related to Snowflake migration at Faire. ☆12 Updated 2 years ago
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes ☆53 Updated 5 years ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆46 Updated last year
- This repository is no longer maintained. ☆15 Updated 3 years ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- Explore Apache Kafka data pipelines in Kubernetes. ☆46 Updated 2 weeks ago
- FADI - Ingest, store and analyse big data flows ☆46 Updated last year
- Useful scripts, utilities, and tools for Snowflake ☆13 Updated 5 years ago
- ☆95 Updated 2 years ago
- Hadoop/Hive/Spark container to perform CI tests ☆11 Updated 4 years ago