arunma / datagen
An easy-to-use tool to generate fake/dummy data in bulk and export it as JSON, CSV, Avro or directly into your database as tables. Written in Rust.
☆9 · Updated 5 years ago
Related projects
Alternatives and complementary repositories for datagen
- Connect DBVisualizer to Hortonworks HiveServer2 ☆9 · Updated 9 years ago
- Data Catalog for Databases and Data Warehouses ☆31 · Updated 10 months ago
- A curated list of awesome PrestoDB / Trino software, libraries, tools and resources ☆16 · Updated 3 years ago
- Mock streaming data generator ☆16 · Updated 5 months ago
- Skeleton project for Apache Airflow training participants to work on. ☆16 · Updated 4 years ago
- The sane way of building a data layer in Airflow ☆24 · Updated 4 years ago
- This is a basic Apache Pinot example for ingesting real-time MySQL change logs using Debezium ☆27 · Updated 3 years ago
- Using the Parquet file format (with Avro) to process data with Apache Flink ☆14 · Updated 9 years ago
- Is using KoP (Kafka-On-Pulsar) a good idea? Use the scenarios implemented in this repository to check whether Pulsar with KoP enabled is … ☆10 · Updated 2 years ago
- Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌 ☆28 · Updated 4 years ago
- FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more... ☆17 · Updated this week
- Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka. ☆58 · Updated last year
- Code for the fictitious food delivery company GottaEat used in the Pulsar In Action book ☆17 · Updated 2 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ☆41 · Updated last year
- This repository contains a recipe for bootstrapping a climate analysis application using Apache Pinot and Superset ☆20 · Updated 4 years ago
- A K8s-based infrastructure for analytics ☆24 · Updated 4 years ago
- Apache Spark based framework for analyzing A/B experiments ☆11 · Updated 2 weeks ago
- Events about the open source data stack ☆13 · Updated 2 years ago
- Apache Solr: Because your Database is not a Search Engine ☆12 · Updated 5 years ago
- A starter project to create Arc jobs using the Jupyter Notebook interface ☆22 · Updated 3 years ago
- Telecom scenarios implemented with streaming techniques ☆11 · Updated last year
- Hands-on workshop with Iceberg, Redpanda, Debezium and Kafka-Connect ☆13 · Updated last month
- Profiles the data, validates the schema and runs data quality checks and produces a report ☆20 · Updated 5 years ago
- BigQuery Schema Conversion Tool ☆23 · Updated 4 years ago
- Service for automatically managing and cleaning up unreferenced data ☆45 · Updated this week
- TLA+ specs for table formats ☆13 · Updated last month
- ARCHIVED: Run Debezium/KafkaConnect CDC components in Kubernetes ☆24 · Updated 5 years ago