oap-project / oap-mllib
Optimized Spark package to accelerate machine learning algorithms in Apache Spark MLlib.
☆20 Updated last week
Related projects
Alternatives and complementary repositories for oap-mllib
- RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries. ☆315 Updated 3 months ago
- A tool and library for easily deploying applications on Apache YARN ☆142 Updated 8 months ago
- Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations. ☆256 Updated last year
- Point-in-Time optimizations for Apache Spark ☆29 Updated 9 months ago
- A repo for all Spark examples using the RAPIDS Accelerator, including ETL, ML/DL, etc. ☆126 Updated this week
- An S3 shuffle plugin for Apache Spark to enable elastic scaling for generic Spark workloads. ☆38 Updated 6 months ago
- Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer. ☆37 Updated last year
- Python - Java/Scala API for the Hopsworks feature store ☆53 Updated last week
- The Internals of Delta Lake ☆182 Updated last month
- Read and write TensorFlow TFRecord data from Apache Spark. ☆290 Updated 6 months ago
- A re-implementation of Hadoop DistCP in Apache Spark ☆44 Updated 10 months ago
- Train TensorFlow models on YARN in just a few lines of code! ☆86 Updated last year
- Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in… ☆84 Updated 7 months ago
- All the things about TPC-DS in Apache Spark ☆104 Updated last year
- FeatHub - A stream-batch unified feature store for real-time machine learning ☆315 Updated 5 months ago
- [ARCHIVED] Moved to github.com/NVIDIA/spark-xgboost-examples ☆70 Updated 4 years ago
- Jupyter extensions for SWAN ☆58 Updated this week
- Spark* shuffle plugin that supports shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local-dis… ☆20 Updated 7 months ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆56 Updated last year
- A tool to get better debug info on Spark's memory usage ☆42 Updated 5 years ago
- Spark SQL index for Parquet tables ☆132 Updated 3 years ago
- Parameter Server implementation in Apache Flink ☆57 Updated 6 years ago
- ☆104 Updated last year
- A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap… ☆297 Updated 10 months ago
- ACID Data Source for Apache Spark based on Hive ACID ☆97 Updated 3 years ago
- Spark Shuffle Optimization with RDMA+AEP ☆30 Updated last year
- Spark Structured Streaming State Tools ☆34 Updated 4 years ago