TileDB-Inc / TileDB-Spark
Spark interface to the TileDB storage manager [please see README]
☆16 Updated 4 months ago
Alternatives and similar repositories for TileDB-Spark
Users that are interested in TileDB-Spark are comparing it to the libraries listed below
- ☆105 Updated last year
- ☆39 Updated 6 years ago
- Spark* shuffle plugin that supports shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local-dis… ☆21 Updated last year
- A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing the UCX communication layer ☆50 Updated last year
- Drizzle integration with Apache Spark ☆120 Updated 6 years ago
- Apache DataSketches ☆95 Updated 2 years ago
- Utility for benchmarking changes in Spark using TPC-DS workloads ☆16 Updated 3 years ago
- Miscellaneous functionality for manipulating Apache Spark RDDs. ☆22 Updated 6 years ago
- Parameter Server implementation in Apache Flink. ☆14 Updated 7 years ago
- Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer. ☆37 Updated 2 years ago
- JVM integration for Weld ☆16 Updated 6 years ago
- Cache File System optimized for columnar formats and object stores ☆182 Updated 2 years ago
- A composable framework for fast and scalable data analytics ☆57 Updated 2 years ago
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange ☆127 Updated 4 months ago
- Albis: High-Performance File Format for Big Data Systems ☆21 Updated 6 years ago
- The Internals of PySpark ☆26 Updated 4 months ago
- Point-in-Time optimizations for Apache Spark ☆30 Updated last year
- Enabling Spark Optimization through Cross-stack Monitoring and Visualization ☆47 Updated 7 years ago
- A series of Jupyter notebooks to demonstrate the functionality of Apache Calcite ☆58 Updated 4 years ago
- RocksDB state storage implementation for Structured Streaming. ☆17 Updated 4 years ago
- Fast I/O plugins for Spark ☆41 Updated 4 years ago
- [ARCHIVED] Moved to github.com/NVIDIA/spark-xgboost-examples ☆70 Updated 4 years ago
- Yggdrasil: Faster Decision Trees Using Column Partitioning in Spark ☆31 Updated 7 years ago
- Self-regulation and auto-tuning for distributed systems ☆65 Updated last year
- Spark SQL index for Parquet tables ☆134 Updated 4 years ago
- XGBoost GPU accelerated on Spark example applications ☆52 Updated 2 years ago
- Spark Terasort ☆122 Updated 2 years ago
- Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in… ☆89 Updated last week
- Splittable Gzip codec for Hadoop ☆70 Updated last month
- Provides GPU awareness to Spark, Contact: @kmadhugit and @kiszk ☆171 Updated 6 years ago