NetEase / lakehouse-benchmark
A benchmark tool for lakehouses.
☆11 Updated last year
Related projects
Alternatives and complementary repositories for lakehouse-benchmark
- JVM integration for Weld ☆16 Updated 6 years ago
- A table schema-less OLAP Analytics Engine for Big Data. ☆24 Updated 7 months ago
- Provide functionality to build statistical models to repair dirty tabular data in Spark ☆12 Updated last year
- Transporter for integrating OpenLineage with OpenMetadata ☆12 Updated 8 months ago
- Serializable ACID transactions on streaming data ☆23 Updated 2 years ago
- ☆18 Updated 6 months ago
- LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as De… ☆69 Updated this week
- A series of Jupyter notebooks to demonstrate the functionality of Apache Calcite ☆54 Updated 4 years ago
- Lakehouse storage system benchmark ☆66 Updated last year
- This repository contains the code base for the Open Stream Processing Benchmark. ☆50 Updated 3 years ago
- Idempotent query executor ☆50 Updated 10 months ago
- ☆35 Updated 2 years ago
- Peel is a framework that helps you to define, execute, analyze, and share experiments for distributed systems and algorithms. ☆27 Updated 2 years ago
- A streaming key-value store implementation using native Flink Streaming operators ☆22 Updated 9 years ago
- This repository provides Scotty, a framework for efficient window aggregations for out-of-order Stream Processing. ☆75 Updated last year
- Spark Structured Streaming State Tools ☆34 Updated 4 years ago
- Alerting and monitoring tool for Apache Spark ☆22 Updated 2 years ago
- Condor allows for the specification of synopsis-based streaming jobs on top of general dataflow systems. Condor provides a collection of … ☆13 Updated 5 months ago
- Mirror of Apache MRQL (Incubating) ☆17 Updated 7 years ago
- A new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems. ☆21 Updated this week
- A tool to get better debug info on Spark's memory usage ☆42 Updated 5 years ago
- ☆104 Updated last year
- Spark-Radiant is an Apache Spark Performance and Cost Optimizer ☆25 Updated 2 years ago
- Sample processing code using Spark 2.1+ and Scala ☆51 Updated 4 years ago
- Spark interface to the TileDB storage manager [please see README] ☆15 Updated last month
- ☆18 Updated 2 years ago
- LinkedIn's version of Apache Calcite ☆22 Updated 2 weeks ago
- Point-in-Time optimizations for Apache Spark ☆29 Updated 10 months ago
- BigQuery integration to Apache Flink's Table API ☆23 Updated this week
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 Updated 3 years ago