sodadata / soda-streaming
☆23 Updated 4 years ago
Alternatives and similar repositories for soda-streaming
Users who are interested in soda-streaming are comparing it to the libraries listed below
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆63 Updated 3 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆30 Updated last week
- An open specification for data products in Data Mesh ☆63 Updated 4 months ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆97 Updated last week
- Make simple storing test results and visualisation of these in a BI dashboard ☆51 Updated last month
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database ☆76 Updated 4 years ago
- Delta Lake helper methods. No Spark dependency. ☆22 Updated last week
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆62 Updated 3 years ago
- Delta Lake Documentation ☆53 Updated last year
- Data Mesh Architecture ☆84 Updated 3 months ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 3 years ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 Updated 2 years ago
- Sample configuration to deploy a modern data platform. ☆89 Updated 4 years ago
- ☆23 Updated 4 years ago
- The Python fake data producer for Apache Kafka® is a complete demo app allowing you to quickly produce JSON fake streaming datasets and … ☆84 Updated last year
- DataOps Observability is part of DataKitchen's Open Source Data Observability. DataOps Observability monitors every data journey from da… ☆50 Updated 2 months ago
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ☆169 Updated 4 months ago
- Utility functions for dbt projects running on Spark ☆34 Updated last month
- A repository of sample code to show data quality checking best practices using Airflow. ☆78 Updated 2 years ago
- DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data qualit… ☆69 Updated 3 weeks ago
- Adapter for dbt that executes dbt pipelines on Apache Flink ☆96 Updated last year
- Schema modelling framework for decentralised domain-driven ownership of data. ☆261 Updated 2 years ago
- Library to convert DBT manifest metadata to Airflow tasks ☆49 Updated last month
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables ☆75 Updated 2 years ago
- Streaming demo dbt ☆17 Updated last year
- Evaluation Matrix for Change Data Capture ☆25 Updated last year
- Delta lake and filesystem helper methods ☆50 Updated last year
- A Table format agnostic data sharing framework ☆42 Updated last year
- Unity Catalog UI ☆43 Updated last year
- Skeleton project for Apache Airflow training participants to work on. ☆17 Updated 5 years ago