JacekMajchrzak / awesome-datamesh
☆89 · Updated last year
Related projects:
- An open specification for data products in Data Mesh ☆53 · Updated 9 months ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆68 · Updated this week
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆60 · Updated last year
- Sample configuration to deploy a modern data platform. ☆84 · Updated 2 years ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆91 · Updated this week
- A K8s-based infrastructure for analytics ☆24 · Updated 4 years ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆63 · Updated 2 years ago
- Data validation library for PySpark 3.0.0 ☆34 · Updated last year
- ☆37 · Updated 6 months ago
- The Data Contract Specification Repository ☆231 · Updated this week
- ☆20 · Updated 3 years ago
- Data Mesh Architecture ☆70 · Updated 2 months ago
- Data Tools Subjective List ☆80 · Updated last year
- Sample Airflow DAGs ☆60 · Updated last year
- A Data Mesh proof-of-concept built on Confluent Cloud ☆2 · Updated last year
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database ☆64 · Updated 3 years ago
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆167 · Updated 10 months ago
- A Table format agnostic data sharing framework ☆36 · Updated 7 months ago
- A curated list of awesome blogs, videos, tools and resources about Data Contracts ☆158 · Updated last month
- Delta Lake Documentation ☆45 · Updated 3 months ago
- Extensible Rules Engine for custom Dataframe / Dataset validation ☆134 · Updated 4 months ago
- The Data Product Descriptor Specification (DPDS) Repository ☆66 · Updated this week
- Pylint plugin for static code analysis on Airflow code ☆89 · Updated 3 years ago
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ☆141 · Updated 2 weeks ago
- ☆32 · Updated 3 months ago
- Data product portal created by Dataminded ☆123 · Updated this week
- ☆195 · Updated 11 months ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 · Updated 2 weeks ago
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python ☆43 · Updated last year
- The go to demo for public and private dbt Learn ☆70 · Updated 2 weeks ago