arunma / datagen

An easy-to-use tool to generate fake/dummy data in bulk and export it as JSON, CSV, or Avro, or load it directly into your database as tables. Written in Rust.

☆9

Alternatives and similar repositories for datagen:

Users who are interested in datagen are comparing it to the repositories listed below.

ArroyoSystems / streamgen
Mock streaming data generator
☆16 · Updated 7 months ago
gunnarmorling / kcetcd
An example source connector for Kafka Connect, ingesting data from etcd
☆11 · Updated 2 years ago
cyanfr / dbvis_to_hortonworks_hiveserver2
Connect DbVisualizer to Hortonworks HiveServer2
☆9 · Updated 9 years ago
tspannhw / FLiPStackWeekly
FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...
☆18 · Updated this week
kbastani / pinot-debezium-basic-example
This is a basic Apache Pinot example for ingesting real-time MySQL change logs using Debezium
☆27 · Updated 4 years ago
anemos-io / protobeam
☆22 · Updated 5 years ago
wirelessr / flink-iceberg-playground
MinIO as local storage and DynamoDB as catalog
☆13 · Updated 8 months ago
bullet-db / bullet-core
Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor…
☆41 · Updated 2 years ago
InMobi / docker-hive
Docker image for Apache Hive running on Tez
☆7 · Updated 10 years ago
Aiven-Open / guardian-for-apache-kafka
Set of tools for creating backups, compaction and restoration of Apache Kafka® Clusters
☆19 · Updated last week
kdrakon / topiks
An interactive CLI tool for managing Kafka topics
☆28 · Updated 5 years ago
steveloughran / zero-rename-committer
Paper: A Zero-rename committer for object stores
☆20 · Updated 3 years ago
riferrei / is-using-kop-a-good-idea
Is using KoP (Kafka-On-Pulsar) a good idea? Use the scenarios implemented in this repository to check whether Pulsar with KoP enabled is …
☆10 · Updated 2 years ago
stackabletech / opa-operator
A Kubernetes operator for the Open Policy Agent
☆16 · Updated this week
gunnarmorling / pgoutput-cli
A command line client for consuming Postgres logical decoding events in the pgoutput format
☆11 · Updated 6 months ago
youngwookim / awesome-presto
A curated list of awesome PrestoDB / Trino software, libraries, tools and resources
☆17 · Updated 3 years ago
MrPowers / scalatest-example
Testing Scala code with ScalaTest
☆12 · Updated 2 years ago
lensesio / avro-sql
Use SQL to transform your Avro schema/records
☆28 · Updated 7 years ago
buoyant-data / oxbow
Collection of AWS Lambdas for creating and managing Delta tables
☆24 · Updated this week
tokern / dbcat
Data Catalog for Databases and Data Warehouses
☆31 · Updated last year
rdblue / parquet-cli
Parquet Command-line Tools
☆18 · Updated 8 years ago
tlepple / data_origination_workshop
Hands-on workshop with Iceberg, Redpanda, Debezium and Kafka-Connect
☆14 · Updated 3 months ago
joerg-schneider / airtunnel
The sane way of building a data layer in Airflow
☆24 · Updated 5 years ago
tomncooper / streaming-sql
Kubernetes deployments and examples for various streaming SQL implementations
☆10 · Updated 2 years ago
stackabletech / airflow-operator
Stackable Operator for Apache Airflow
☆22 · Updated this week
fithisux / experiment-with-trino-minio-hive
☆13 · Updated last year
paypal / dione
Dione - a Spark and HDFS indexing library
☆50 · Updated 9 months ago
david-streamlio / GottaEat
Code for the fictitious food delivery company GottaEat used in the Pulsar In Action book
☆17 · Updated 2 years ago