microsoft / Peregrine
Peregrine is a workload optimization platform for cloud query engines. The goal of Peregrine is three-fold: 1. make it easier to ingest and analyze query workload telemetry into a common engine-agnostic representation, 2. help developers to quickly build workload optimization applications to reduce overall costs and improve operational efficien…
☆22Updated 5 years ago
Alternatives and similar repositories for Peregrine
Users that are interested in Peregrine are comparing it to the libraries listed below
- An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.☆430Updated 3 years ago
- Mirror of Apache crail (Incubating)☆150Updated 3 years ago
- LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as De…☆88Updated last month
- MLOS is a project to enable autotuning for systems.☆167Updated this week
- TPC-H queries in Apache Spark SQL using native DataFrames API☆98Updated last year
- Code for Ernest☆33Updated 2 years ago
- A stateful serverless platform☆245Updated 2 years ago
- Performance Analysis Tool☆78Updated 3 weeks ago
- Window-Based Hybrid CPU/GPU Stream Processing Engine☆42Updated 3 years ago
- Fast I/O plugins for Spark☆41Updated 5 years ago
- Self regulation and auto-tuning for distributed system☆67Updated 2 years ago
- Snowflake dataset containing statistics for 70 million queries over a 14-day period☆116Updated 4 years ago
- A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer☆51Updated 2 years ago
- Interactive-Speed Analytics: 200x Faster, 200x Fewer Cluster Resources, Approximate Query Processing☆250Updated 4 years ago
- Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.☆258Updated 2 years ago
- tpch-dbgen☆38Updated 13 years ago
- A modular acceleration toolkit for big data analytic engines☆67Updated last year
- TPCDS benchmark for various engines☆18Updated 3 years ago
- Use the TPC-DS benchmark to test Spark SQL performance☆183Updated 5 years ago
- Lakehouse storage system benchmark☆77Updated 2 years ago
- Parquet file generator☆22Updated 7 years ago
- DS2 is an auto-scaling controller for distributed streaming dataflows☆91Updated 2 years ago
- Spark Terasort☆121Updated 2 years ago
- BI benchmark with user generated data and queries☆72Updated last year
- Transactions for Stateful Functions as a Service. This repository implements an API and associated underpinnings for two-phase Commit an…☆25Updated 3 years ago
- Point-in-Time optimizations for Apache Spark☆30Updated last year
- Albis: High-Performance File Format for Big Data Systems☆21Updated 7 years ago
- The DSB benchmark is designed for evaluating both workload-driven and traditional database systems on modern decision support workloads. D…☆71Updated last year
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange☆130Updated last year
- Stocator is a high-performing connector to object storage for Apache Spark, achieving performance by leveraging object storage semantics.☆114Updated last year