oap-project / pmem-shuffle
Spark* shuffle plugin that supports shuffling through remote persistent memory over fabrics. It leverages an RDMA network and remote persistent memory (for reads) to provide an extremely high-performance, low-latency shuffle solution for Spark*.
☆14 Updated 2 years ago
Alternatives and similar repositories for pmem-shuffle
Users that are interested in pmem-shuffle are comparing it to the libraries listed below
- TPC-H queries in Apache Spark SQL using native DataFrames API ☆97 Updated last year
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange ☆128 Updated 9 months ago
- Mirror of Apache crail (Incubating) ☆150 Updated 3 years ago
- Naos: Serialization-free RDMA networking in Java ☆17 Updated 4 years ago
- Lakehouse storage system benchmark ☆76 Updated 2 years ago
- This is archive of SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvid… ☆253 Updated 6 years ago
- Use the TPC-DS benchmark to test Spark SQL performance ☆181 Updated 5 years ago
- A modular acceleration toolkit for big data analytic engines ☆67 Updated last year
- Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations. ☆257 Updated 2 years ago
- This repository contains the code base for the Open Stream Processing Benchmark. ☆52 Updated 3 years ago
- All the things about TPC-DS in Apache Spark ☆107 Updated 2 years ago
- TPC-DS benchmark kit with some modifications/fixes ☆342 Updated last year
- Star Schema Benchmark dbgen ☆125 Updated last year
- High Performance Network Library for RDMA ☆27 Updated 2 years ago
- A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer ☆51 Updated last year
- Java bindings for https://github.com/facebookincubator/velox ☆33 Updated this week
- Spark Terasort ☆121 Updated 2 years ago
- Snowflake dataset containing statistics for 70 million queries over 14 day period ☆115 Updated 3 years ago
- Performance Analysis Tool ☆77 Updated 3 months ago
- Benchmarks for queries over continuous data streams. ☆359 Updated 9 months ago
- Spark Shuffle Optimization with RDMA+AEP ☆30 Updated 2 years ago
- Generic driver for LDBC Graphalytics implementation ☆84 Updated 9 months ago
- Trisk on Flink ☆16 Updated 3 years ago
- Source code for TPCx-BB benchmark for Hive and SparkSQL on scale factor of 300 GB ☆10 Updated 7 years ago
- Gluten: Plugin to Boost Trino's Performance ☆76 Updated last year
- stream processing reading list ☆69 Updated 2 years ago
- Smart Storage Management for Big Data, a comprehensive hot/cold data optimized solution ☆141 Updated 2 years ago
- DiSNI: Direct Storage and Networking Interface ☆191 Updated 2 years ago
- "GraphOne: A Data Store for Real-time Analytics on Evolving Graphs", Usenix FAST'19 ☆59 Updated 4 years ago
- A hadoop compatible FUSE use for all. ☆29 Updated last year