oap-project / pmem-shuffle
Spark* shuffle plugin that supports shuffling through remote persistent memory over fabrics. It leverages the RDMA network and remote persistent memory (for reads) to provide an extremely high-performance, low-latency shuffle solution for Spark*.
☆14 · Updated last year
Alternatives and similar repositories for pmem-shuffle
Users who are interested in pmem-shuffle are comparing it to the repositories listed below:
- Spark Shuffle Optimization with RDMA+AEP ☆30 · Updated 2 years ago
- All the things about TPC-DS in Apache Spark ☆106 · Updated 2 years ago
- Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer. ☆37 · Updated 2 years ago
- Mirror of Apache crail (Incubating) ☆150 · Updated 2 years ago
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange ☆127 · Updated 6 months ago
- Naos: Serialization-free RDMA networking in Java ☆15 · Updated 3 years ago
- A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer ☆50 · Updated last year
- Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations. ☆257 · Updated 2 years ago
- TPC-H queries in Apache Spark SQL using native DataFrames API ☆99 · Updated last year
- This is an archive of the SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvid… ☆250 · Updated 6 years ago
- Use the TPC-DS benchmark to test Spark SQL performance ☆180 · Updated 5 years ago
- Performance Analysis Tool ☆76 · Updated 3 weeks ago
- Java bindings for https://github.com/facebookincubator/velox ☆27 · Updated this week
- Star Schema Benchmark dbgen ☆124 · Updated last year
- Snowflake dataset containing statistics for 70 million queries over 14 day period ☆113 · Updated 3 years ago
- A modular acceleration toolkit for big data analytic engines ☆68 · Updated last year
- Lakehouse storage system benchmark ☆75 · Updated 2 years ago
- A Hadoop-compatible FUSE usable by all. ☆29 · Updated 9 months ago
- Spark Terasort ☆121 · Updated 2 years ago
- DS2 is an auto-scaling controller for distributed streaming dataflows ☆89 · Updated 2 years ago
- This repository contains the code base for the Open Stream Processing Benchmark. ☆51 · Updated 3 years ago
- An Extensible Data Skipping Framework ☆47 · Updated 5 months ago
- Gluten: Plugin to Boost Trino's Performance ☆71 · Updated last year
- ☆35 · Updated last year
- A Benchmark Harness for Systematic and Robust Evaluation of Streaming State Stores ☆17 · Updated last year
- Stream processing reading list ☆69 · Updated 2 years ago
- Spark* shuffle plugin that supports shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local-dis… ☆21 · Updated last year
- The preview version of a spillable state backend for Apache Flink ☆39 · Updated 4 years ago
- A Multicore, NUMA Optimised Data Stream Processing System ☆38 · Updated 2 years ago
- A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer ☆28 · Updated last year