oap-project / pmem-shuffle
A Spark* shuffle plugin that supports shuffling through remote persistent memory over fabrics. It leverages the RDMA network and remote persistent memory (for reads) to provide a high-performance, low-latency shuffle solution for Spark*.
☆14 · Updated 2 years ago
Alternatives and similar repositories for pmem-shuffle
Users who are interested in pmem-shuffle are comparing it to the libraries listed below.
- TPC-H queries in Apache Spark SQL using native DataFrames API ☆98 · Updated last year
- Mirror of Apache crail (Incubating) ☆151 · Updated 3 years ago
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange ☆130 · Updated last year
- Use the TPC-DS benchmark to test Spark SQL performance ☆183 · Updated 5 years ago
- Lakehouse storage system benchmark ☆77 · Updated 2 years ago
- This is archive of SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvid… ☆258 · Updated 6 years ago
- Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations. ☆258 · Updated 2 years ago
- This repository contains the code base for the Open Stream Processing Benchmark. ☆55 · Updated 4 years ago
- Snowflake dataset containing statistics for 70 million queries over 14 day period ☆115 · Updated 4 years ago
- A modular acceleration toolkit for big data analytic engines ☆67 · Updated last year
- All the things about TPC-DS in Apache Spark ☆109 · Updated 2 years ago
- Trisk on Flink ☆16 · Updated 3 years ago
- Naos: Serialization-free RDMA networking in Java ☆17 · Updated 4 years ago
- Star Schema Benchmark dbgen ☆125 · Updated last year
- TPC-DS benchmark kit with some modifications/fixes ☆353 · Updated last year
- Spark Shuffle Optimization with RDMA+AEP ☆30 · Updated 2 years ago
- Transactions for Stateful Functions as a Service. This repository implements an API and associated underpinnings for two-phase Commit an… ☆25 · Updated 3 years ago
- A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer ☆52 · Updated 2 years ago
- Gluten: Plugin to Boost Trino's Performance ☆76 · Updated 2 years ago
- Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer. ☆37 · Updated 3 years ago
- Benchmarks for queries over continuous data streams. ☆373 · Updated 3 weeks ago
- Community Java bindings for https://github.com/facebookincubator/velox ☆37 · Updated this week
- Source code for TPCx-BB benchmark for Hive and SparkSQL on scale factor of 300 GB ☆10 · Updated 7 years ago
- tpch-dbgen ☆38 · Updated 13 years ago
- TPC-DS queries ☆64 · Updated 10 years ago
- DS2 is an auto-scaling controller for distributed streaming dataflows ☆91 · Updated 2 years ago
- Generic driver for LDBC Graphalytics implementation ☆86 · Updated last year
- A Multicore, NUMA Optimised Data Stream Processing System ☆40 · Updated 3 years ago
- A Benchmark Harness for Systematic and Robust Evaluation of Streaming State Stores ☆17 · Updated last year
- Condor allows for the specification of synopsis-based streaming jobs on top of general dataflow systems. Condor provides a collection of … ☆13 · Updated last year