Generate authentic-looking mock data based on a SQL, JSON, or Avro schema and produce it to Kafka in JSON or Avro format.
☆169 · Sep 13, 2025 · Updated 6 months ago
Alternatives and similar repositories for datagen
Users that are interested in datagen are comparing it to the libraries listed below.
- ☆41 · Mar 18, 2024 · Updated 2 years ago
- Demos using Conduktor Gateway ☆18 · Apr 11, 2024 · Updated last year
- CLI for scraping, querying and visualizing Prometheus metrics. ☆17 · Mar 14, 2026 · Updated last week
- Demos of Materialize, the operational data warehouse. ☆52 · Mar 5, 2025 · Updated last year
- A Terraform provider for Materialize ☆14 · Mar 16, 2026 · Updated last week
- Multi-hop declarative data pipelines ☆125 · Updated this week
- Schema Registry Statistics Tool ☆24 · Updated this week
- A list of all awesome open-source contributions for the Apache Kafka project ☆111 · Jul 10, 2023 · Updated 2 years ago
- ☆26 · Nov 28, 2022 · Updated 3 years ago
- Kafka Connector for Iceberg tables ☆16 · Jul 24, 2023 · Updated 2 years ago
- Serverless multi-protocol + multi-destination event collection system. ☆210 · Nov 24, 2024 · Updated last year
- Work with your web service, database, and streaming schemas in a single format. ☆351 · Dec 30, 2025 · Updated 2 months ago
- A dbt adapter for Decodable ☆12 · Sep 4, 2025 · Updated 6 months ago
- The live data layer for apps and AI agents. Create up-to-the-second views into your business, just using SQL ☆6,251 · Updated this week
- Rust build system integration for protobuf, Google's data interchange format. ☆21 · Jan 9, 2026 · Updated 2 months ago
- Kafka Connect JSONata Transform ☆12 · Feb 24, 2025 · Updated last year
- Time-related test utilities for Java ☆22 · Mar 12, 2026 · Updated last week
- Supporting materials/code examples for my course in data engineering for machine learning. ☆39 · Nov 15, 2022 · Updated 3 years ago
- 🌊 Continuously synchronize the systems where your data lives, to the systems where you _want_ it to live, with Estuary Flow. 🌊 ☆897 · Updated this week
- This library contains the Kinesis Analytics stream processing runtime configuration classes. ☆11 · Jan 26, 2026 · Updated last month
- Example GitHub Actions for Apache Kafka client application development for local and Confluent Cloud ☆15 · Aug 1, 2022 · Updated 3 years ago
- This is a comprehensive end-to-end data engineering project. I extracted data directly from YouTube in raw JSON format using Python and A… ☆11 · Jun 4, 2024 · Updated last year
- Conduit streams data between data stores. Kafka Connect replacement. No JVM required. ☆584 · Updated this week
- 🚀 Example configuration files to help you get started. ☆46 · Mar 16, 2026 · Updated last week
- Benchmarks to read parquet to arrow ☆11 · Dec 25, 2022 · Updated 3 years ago
- This is a simple Python daemon to monitor your Impala nodes. ☆10 · Apr 13, 2021 · Updated 4 years ago
- Send customized alerts for your dbt project with simple tags ☆10 · Jul 27, 2021 · Updated 4 years ago
- Karapace - Your Apache Kafka® essentials in one tool ☆603 · Updated this week
- ☆23 · Jun 3, 2021 · Updated 4 years ago
- Package to assert rows in-line with dbt macros. ☆71 · Nov 25, 2025 · Updated 3 months ago
- ZSH plugin to have Kafka automatic completion for most CLI tools ☆68 · Aug 19, 2022 · Updated 3 years ago
- A python library bakeoff for medium sized datasets ☆24 · Aug 25, 2023 · Updated 2 years ago
- ☆81 · Apr 23, 2025 · Updated 11 months ago
- Set up a Cost-Effective Modern Data Stack for a Charity ☆19 · Mar 26, 2025 · Updated 11 months ago
- Self-contained demo using Kafka, Materialize and Metabase to check what's streaming on Twitch. All you need is Docker and Twitch access t… ☆25 · Mar 22, 2022 · Updated 4 years ago
- Set of small KafkaStreams demos/tests ☆14 · May 14, 2019 · Updated 6 years ago
- This library contains various Apache Flink connectors to connect to AWS data sources and sinks. ☆16 · Dec 5, 2023 · Updated 2 years ago
- chDB AWS Lambda container ☆18 · Aug 31, 2023 · Updated 2 years ago
- Schema modelling framework for decentralised domain-driven ownership of data. ☆261 · Dec 5, 2023 · Updated 2 years ago