microsoft / Peregrine
Peregrine is a workload optimization platform for cloud query engines. The goal of Peregrine is three-fold: 1. make it easier to ingest and analyze query workload telemetry into a common engine-agnostic representation, 2. help developers to quickly build workload optimization applications to reduce overall costs and improve operational efficien…
☆22Updated 4 years ago
Alternatives and similar repositories for Peregrine:
Users who are interested in Peregrine are comparing it to the libraries listed below.
- LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as De…☆74Updated last week
- Mirror of Apache crail (Incubating)☆150Updated 2 years ago
- Lakehouse storage system benchmark☆73Updated 2 years ago
- A modular acceleration toolkit for big data analytic engines☆68Updated 11 months ago
- Self-regulation and auto-tuning for distributed systems☆65Updated last year
- The DSB benchmark is designed for evaluating both workload-driven and traditional database systems on modern decision support workloads. D…☆53Updated 5 months ago
- TPC-DS queries☆60Updated 9 years ago
- TPC-H queries in Apache Spark SQL using native DataFrames API☆99Updated last year
- tpch-dbgen☆38Updated 12 years ago
- Snowflake dataset containing statistics for 70 million queries over 14 day period☆112Updated 3 years ago
- Performance Analysis Tool☆76Updated 2 years ago
- ☆84Updated this week
- An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.☆425Updated 3 years ago
- Spark Terasort☆122Updated 2 years ago
- Java bindings for https://github.com/facebookincubator/velox☆21Updated this week
- Transactions for Stateful Functions as a Service. This repository implements an API and associated underpinnings for two-phase Commit an…☆25Updated 2 years ago
- DS2 is an auto-scaling controller for distributed streaming dataflows☆89Updated 2 years ago
- Albis: High-Performance File Format for Big Data Systems☆21Updated 6 years ago
- Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.☆257Updated 2 years ago
- ☆23Updated last month
- [Archived] A Fast Multi-tiered Distributed Storage System based on User-Level I/O☆72Updated 7 years ago
- A stateful serverless platform☆240Updated 2 years ago
- This is an archive of the SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvid…☆245Updated 5 years ago
- BI benchmark with user generated data and queries☆65Updated 4 months ago
- Spark* shuffle plugin that supports shuffling through remote persistent memory over fabrics, which leverages the RDMA network and remote pe…☆14Updated last year
- Measuring the performance of popular streaming engines with Yahoo's Streaming Benchmark☆53Updated 5 years ago
- Elastic ephemeral storage☆119Updated 3 years ago
- Cache File System optimized for columnar formats and object stores☆182Updated 2 years ago
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange☆127Updated 4 months ago
- This repository contains the code base for the Open Stream Processing Benchmark.☆50Updated 3 years ago