MaterializeInc / datagen
Generate authentic-looking mock data based on a SQL, JSON, or Avro schema and produce it to Kafka in JSON or Avro format.
☆157 Updated 4 months ago
Alternatives and similar repositories for datagen:
Users who are interested in datagen are comparing it to the libraries listed below.
- Schema modelling framework for decentralised domain-driven ownership of data. ☆251 Updated last year
- Multi-hop declarative data pipelines ☆112 Updated last week
- In-Memory Analytics for Kafka using DuckDB ☆108 Updated this week
- CLI tool to bulk migrate the tables from one catalog to another without a data copy ☆76 Updated last month
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake ☆238 Updated last week
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com) ☆231 Updated this week
- The bridge to effortless multi-engine data applications, currently supports Snowflake ❄️ and DuckDB 🦆 ☆174 Updated last week
- Quick Guides from Dremio on Several topics ☆69 Updated 2 months ago
- Yet Another (Spark) ETL Framework ☆20 Updated last year
- Open Control Plane for Tables in Data Lakehouse ☆333 Updated this week
- ☆79 Updated last year
- ☆244 Updated this week
- A Table format agnostic data sharing framework ☆38 Updated last year
- Adapter for dbt that executes dbt pipelines on Apache Flink ☆92 Updated last year
- This is the main repository for SDF documentation found at docs.sdf.com, as well as public schemas, benchmarks, and examples ☆117 Updated last month
- Work with your web service, database, and streaming schemas in a single format. ☆343 Updated last year
- Data Tools Subjective List ☆83 Updated last year
- Apache Hive Metastore as a Standalone server in Docker ☆68 Updated 7 months ago
- Modern serverless lakehouse implementing HOOK methodology, Unified Star Schema (USS), and Analytical Data Storage System (ADSS) principle… ☆89 Updated this week
- ☆71 Updated 2 months ago
- Open, Multi-modal Catalog for Data & AI, written in Rust ☆78 Updated 6 months ago
- Proof-of-concept extension combining the delta extension with Unity Catalog ☆78 Updated 3 weeks ago
- Enables Python developers to leverage Debezium's CDC capabilities with custom event handlers and seamless integration. ☆24 Updated last month
- Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines. ☆84 Updated this week
- Example Dagster Cloud code for the Hooli Data Engineering organization. ☆1 Updated last week
- Use dbt to manage real-time data transformations in RisingWave. ☆25 Updated 3 weeks ago
- dbt (data build tool) adapter for Dremio ☆50 Updated last week
- 📚 Tech blogs & talks by companies that run Apache Flink in production ☆167 Updated 2 months ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆134 Updated 2 months ago
- A DuckDB-powered command line interface for Snowflake security, governance, operations, and cost optimization. ☆39 Updated 7 months ago