oap-project / pmem-shuffle
Spark* shuffle plugin that supports shuffling through remote persistent memory over fabrics. It leverages the RDMA network and remote persistent memory (for reads) to provide a high-performance, low-latency shuffle solution for Spark*.
☆14 Updated last year
Alternatives and similar repositories for pmem-shuffle:
Users who are interested in pmem-shuffle are comparing it to the libraries listed below:
- A modular acceleration toolkit for big data analytic engines ☆67 Updated 9 months ago
- Mirror of Apache Crail (Incubating) ☆149 Updated 2 years ago
- Naos: Serialization-free RDMA networking in Java ☆14 Updated 3 years ago
- TPC-H queries in Apache Spark SQL using native DataFrames API ☆98 Updated last year
- Spark Shuffle Optimization with RDMA+AEP ☆30 Updated last year
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange ☆127 Updated 2 months ago
- Lakehouse storage system benchmark ☆70 Updated last year
- A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer ☆49 Updated last year
- Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer. ☆37 Updated 2 years ago
- Snowflake dataset containing statistics for 70 million queries over a 14-day period ☆111 Updated 3 years ago
- All the things about TPC-DS in Apache Spark ☆104 Updated last year
- Use the TPC-DS benchmark to test Spark SQL performance ☆177 Updated 4 years ago
- A Benchmark Harness for Systematic and Robust Evaluation of Streaming State Stores ☆17 Updated 9 months ago
- Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations. ☆257 Updated 2 years ago
- Spark Terasort ☆123 Updated last year
- High Performance Network Library for RDMA ☆27 Updated 2 years ago
- This is an archive of the SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvid… ☆242 Updated 5 years ago
- Transactions for Stateful Functions as a Service. This repository implements an API and associated underpinnings for two-phase Commit an… ☆25 Updated 2 years ago
- The preview version of a spillable state backend for Apache Flink ☆39 Updated 3 years ago
- Reducing the cache misses of SIMD vectorization using IMV ☆27 Updated 2 years ago
- Performance Analysis Tool ☆76 Updated 2 years ago
- Implementation of the algorithm described in the "Hardware-conscious Hash-Joins on GPUs" paper presented at ICDE 2019 ☆33 Updated 4 years ago
- GPU library for writing SQL queries ☆70 Updated 8 months ago
- Blaze: Fast Graph Processing on Fast SSDs (SC'22) ☆10 Updated 2 years ago
- A Skew-Resistant Index for Processing-in-Memory ☆25 Updated 4 months ago
- Spark* shuffle plugin that supports shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local-dis… ☆21 Updated 11 months ago
- ☆29 Updated 4 months ago
- Gluten: Plugin to Boost Trino's Performance ☆70 Updated last year
- Star Schema Benchmark dbgen ☆121 Updated 11 months ago
- ☆32 Updated 8 months ago