sodadata / soda-streaming
☆23 · Updated 4 years ago
Alternatives and similar repositories for soda-streaming
Users that are interested in soda-streaming are comparing it to the libraries listed below
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆63 · Updated 3 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆30 · Updated this week
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ☆168 · Updated 3 months ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆97 · Updated last week
- An open specification for data products in Data Mesh ☆63 · Updated 2 months ago
- Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and… ☆28 · Updated 2 years ago
- Sample code to collect Apache Iceberg metrics for table monitoring ☆29 · Updated last year
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database ☆77 · Updated 4 years ago
- Schema modelling framework for decentralised domain-driven ownership of data. ☆259 · Updated 2 years ago
- ⚠️ MAINTENANCE-ONLY MODE: Snowplow maintained SQL data models for working with Snowplow web and mobile behavioral data. ☆42 · Updated 11 months ago
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆67 · Updated 3 years ago
- ☆81 · Updated 7 months ago
- DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes. ☆267 · Updated 8 months ago
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆159 · Updated 3 years ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆176 · Updated 3 weeks ago
- Make simple storing test results and visualisation of these in a BI dashboard ☆52 · Updated 2 months ago
- This repository contains recipes for Apache Pinot. ☆32 · Updated 9 months ago
- Python package for querying iceberg data through duckdb. ☆70 · Updated last year
- Unity Catalog UI ☆43 · Updated last year
- Official repo for the Materialize + Redpanda + dbt Hack Day 2022, including a sample project to get everyone started! ☆61 · Updated 3 years ago
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. ☆25 · Updated 7 years ago
- ☆23 · Updated 4 years ago
- Utility functions for dbt projects running on Spark ☆34 · Updated last week
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆77 · Updated last week
- Example Dagster Cloud code for the Hooli Data Engineering organization. ☆17 · Updated last month
- DataOps Observability is part of DataKitchen's Open Source Data Observability. DataOps Observability monitors every data journey from da… ☆50 · Updated last month
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats ☆31 · Updated 2 years ago
- Data Product Portal created by Dataminded ☆196 · Updated this week
- Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines. ☆175 · Updated this week
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables ☆74 · Updated 2 years ago