TU-Berlin-DIMA / myriad-toolkit
Myriad Parallel Data Generator Toolkit
☆21Updated 11 years ago
Alternatives and similar repositories for myriad-toolkit
Users who are interested in myriad-toolkit are comparing it to the libraries listed below.
- Running TPC-H on Apache Hive☆41Updated 6 years ago
- An open-source, vendor-neutral data context service.☆161Updated 7 years ago
- An efficient updatable key-value store for Apache Spark☆254Updated 8 years ago
- Spark SQL index for Parquet tables☆134Updated 4 years ago
- Live-updating Spark UI built with Meteor☆189Updated 4 years ago
- Interactive-Speed Analytics: 200x Faster, 200x Fewer Cluster Resources, Approximate Query Processing☆252Updated 5 years ago
- HopsWorks - Hadoop for Humans☆117Updated 6 years ago
- The SpliceSQL Engine☆171Updated 2 years ago
- Quark is a data virtualization engine over analytic databases.☆100Updated 8 years ago
- BlinkDB: Sub-Second Approximate Queries on Very Large Data.☆659Updated 11 years ago
- An experimental Graph Streaming API for Apache Flink☆141Updated 5 years ago
- A super simple utility for testing Apache Hive scripts locally for non-Java developers.☆73Updated 8 years ago
- Enabling queries on compressed data.☆282Updated 2 years ago
- Stocator is a high-performing connector to object storage for Apache Spark, achieving performance by leveraging object storage semantics.☆114Updated last year
- Cache File System optimized for columnar formats and object stores☆187Updated 3 years ago
- An extension of Yahoo's Benchmarks☆109Updated 2 years ago
- Drizzle integration with Apache Spark☆120Updated 7 years ago
- TPC-DS benchmark kit with some modifications/additions☆10Updated 10 years ago
- Apache Flink cluster deployment in Docker containers using Docker-Compose☆18Updated 10 years ago
- Mirror of Apache Samoa (Incubating)☆250Updated 2 years ago
- ☆107Updated 2 years ago
- Website for DataSketches.☆108Updated this week
- Generates more or less realistic log data for testing simple aggregation queries.☆263Updated 2 years ago
- Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or what…☆96Updated 6 years ago
- Flink performance tests☆20Updated 10 years ago
- Provides GPU awareness to Spark, Contact: @kmadhugit and @kiszk☆172Updated 7 years ago
- Port of TPC-DS dsdgen to Java☆50Updated last year
- Scripts to analyze Spark's performance☆136Updated 7 years ago
- ☆97Updated this week
- A series of Jupyter notebooks to demonstrate the functionality of Apache Calcite☆59Updated 5 years ago