NetEase / lakehouse-benchmark
A benchmark tool for lakehouses.
☆11Updated last year
Alternatives and similar repositories for lakehouse-benchmark:
Users who are interested in lakehouse-benchmark are comparing it to the libraries listed below.
- A table schema-less OLAP Analytics Engine for Big Data.☆24Updated 8 months ago
- Spark-Radiant is an Apache Spark performance and cost optimizer☆25Updated 2 weeks ago
- Serializable ACID transactions on streaming data☆24Updated 2 years ago
- A library to support building a coherent set of flink jobs☆16Updated 3 months ago
- Pulsar lakehouse connector☆31Updated this week
- On-the-fly translation of Spark programs to run natively on your Oracle DB. Your Spark programs require no changes.☆31Updated last year
- Dione - a Spark and HDFS indexing library☆50Updated 9 months ago
- Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines☆17Updated 4 years ago
- Provides functionality to build statistical models to repair dirty tabular data in Spark☆12Updated last year
- Lakehouse storage system benchmark☆66Updated last year
- A tool to get better debug info on spark's memory usage☆42Updated 5 years ago
- Condor allows for the specification of synopsis-based streaming jobs on top of general dataflow systems. Condor provides a collection of …☆13Updated 6 months ago
- LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as De…☆72Updated this week
- ☆36Updated 2 years ago
- All the things about TPC-DS in Apache Spark☆105Updated last year
- Scalable CDC Pattern Implemented using PySpark☆18Updated 5 years ago
- A series of Jupyter notebooks to demonstrate the functionality of Apache Calcite☆57Updated 4 years ago
- LinkedIn's version of Apache Calcite☆22Updated 2 months ago
- Alerting and monitoring tool for Apache Spark☆23Updated 2 years ago
- Apache Iceberg Documentation Site☆41Updated 11 months ago
- Transporter for integrating OpenLineage with OpenMetadata☆12Updated 10 months ago
- Monitoring and insights on your data lakehouse tables☆27Updated 2 weeks ago
- Spark* shuffle plugin that supports shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local-dis…☆21Updated 10 months ago
- BigQuery integration to Apache Flink's Table API☆28Updated this week
- ☆104Updated last year
- Code snippets used in demos recorded for the blog.☆29Updated this week
- Processing videos on Apache Spark☆12Updated 2 years ago
- Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer.☆37Updated 2 years ago
- JVM integration for Weld☆16Updated 6 years ago
- Testing Scala code with scalatest☆12Updated 2 years ago