FINRAOS / DataGenerator
DataGenerator is a Java library for systematically producing large volumes of data. DataGenerator frames data production as a modeling problem, with a user providing a model of dependencies among variables and the library traversing the model to produce relevant data sets.
☆162 Updated 2 years ago
Alternatives and similar repositories for DataGenerator:
Users interested in DataGenerator are comparing it to the libraries listed below.
- Mirror of Apache Apex malhar ☆132 Updated 5 years ago
- An Open Source unit test framework for Hive queries based on JUnit 4 and 5 ☆256 Updated 2 months ago
- The DataHelix generator allows you to quickly create data, based on a JSON profile that defines fields and the relationships between them… ☆142 Updated last year
- Apache Fluo ☆188 Updated last week
- spark + drools ☆102 Updated 2 years ago
- Java library for generating test data ☆170 Updated 4 years ago
- A visual ETL development and debugging tool for big data ☆153 Updated 2 years ago
- CDAP Applications ☆43 Updated 7 years ago
- Kite SDK Examples ☆99 Updated 3 years ago
- ☆204 Updated last year
- Generate Avro schema and Avro binary from XSD schema and XML ☆68 Updated 8 years ago
- Next-generation web analytics processing with Scala, Spark, and Parquet. ☆331 Updated 10 years ago
- Functional testing framework for Big Data pipelines. ☆56 Updated last year
- Herd is a managed data lake for the cloud. The Herd unified data catalog helps separate storage from compute in the cloud. Manage petabyt… ☆135 Updated 2 years ago
- Mirror of Apache Apex core ☆349 Updated 3 years ago
- Apache Spark applications ☆70 Updated 7 years ago
- Apache Ignite Extensions ☆46 Updated 7 years ago
- Complex Event Processing on top of Kafka Streams ☆311 Updated last year
- The SpliceSQL Engine ☆168 Updated last year
- Fast and efficient batch computation engine for complex analysis and reporting of massive datasets on Hadoop ☆243 Updated 9 years ago
- The Schema Repo is a RESTful web service for storing and serving mappings between schema identifiers and schema definitions. ☆156 Updated 2 years ago
- Hadoop-Unit is a project that allows testing projects which need the Hadoop ecosystem, like Kafka, Solr, HDFS, Hive, HBase, ... ☆52 Updated 2 years ago
- A simple storm performance/stress test ☆74 Updated 2 years ago
- Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or what… ☆95 Updated 5 years ago
- Support Highcharts in Apache Zeppelin ☆81 Updated 7 years ago
- Schema Registry ☆16 Updated 9 months ago
- Code to index Hive tables to Solr and Solr indexes to Hive ☆48 Updated 5 years ago
- Apache Streams ☆77 Updated last year
- Quark is a data virtualization engine over analytic databases. ☆98 Updated 7 years ago
- Distributed, streaming anomaly detection and prediction with HTM in Apache Flink ☆135 Updated 7 years ago