JacekMajchrzak / awesome-datamesh
☆93 Updated last year
Alternatives and similar repositories for awesome-datamesh:
Users who are interested in awesome-datamesh are comparing it to the repositories listed below:
- Support for generating modern platforms dynamically with services such as Kafka, Spark, StreamSets, HDFS, ... ☆74 Updated this week
- An open specification for data products in Data Mesh ☆55 Updated 2 months ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark DataFrames ☆63 Updated 2 years ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆93 Updated this week
- Sample configuration to deploy a modern data platform. ☆87 Updated 3 years ago
- A table-format-agnostic data sharing framework ☆38 Updated 11 months ago
- Data validation library for PySpark 3.0.0 ☆34 Updated 2 years ago
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆61 Updated 2 years ago
- A K8s-based infrastructure for analytics ☆24 Updated 5 years ago
- Yet Another (Spark) ETL Framework ☆18 Updated last year
- A repository of sample code to show data quality checking best practices using Airflow. ☆74 Updated last year
- Great Expectations Airflow operator ☆160 Updated 2 months ago
- Data Mesh Architecture ☆74 Updated 6 months ago
- ☆196 Updated last year
- Rules-based grant management for Snowflake ☆40 Updated 5 years ago
- ☆47 Updated 5 months ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 Updated 3 weeks ago
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆65 Updated 3 years ago
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆166 Updated last year
- Data Tools Subjective List ☆82 Updated last year
- ⚠️ MAINTENANCE-ONLY MODE: Snowplow-maintained SQL data models for working with Snowplow web and mobile behavioral data. ☆41 Updated last week
- A Python library to support running data quality rules while the Spark job is running ⚡ ☆167 Updated last week
- The Data Product Descriptor Specification (DPDS) Repository ☆76 Updated this week
- A repository of sample code to accompany our blog post on Airflow and dbt. ☆168 Updated last year
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python ☆44 Updated 2 years ago