LLNL / spark-hdf5
A plugin to enable Apache Spark to read HDF5 files
☆20 Updated 9 years ago
Alternatives and similar repositories for spark-hdf5
Users who are interested in spark-hdf5 are comparing it to the libraries listed below.
- Supporting Hierarchical Data Format and Rich Parallel I/O Interface in Spark ☆42 Updated 4 years ago
- Launching and controlling Spark on HPC clusters ☆23 Updated 3 years ago
- MPI-oriented extension of the Spark computational model ☆24 Updated 7 years ago
- Set up tools for running a few DL libraries on CDH and CDSW ☆17 Updated 5 years ago
- Magpie contains a number of scripts for running Big Data software in HPC environments, including Hadoop and Spark. There is support for L… ☆196 Updated 11 months ago
- Example for experimenting with how JupyterHub can be configured to work with Kerberos ☆33 Updated 8 years ago
- Scientific Spark - a NASA AIST14 project ☆86 Updated 7 years ago
- Spark GPU and SIMD Support ☆61 Updated 5 years ago
- Java read and write example for Apache Arrow ☆34 Updated 8 years ago
- Heterogeneity-incorporating Workflow ApplicationMaster for YARN ☆26 Updated 8 years ago
- Stocator is a high-performing connector to object storage for Apache Spark, achieving performance by leveraging object storage semantics. ☆114 Updated last year
- Some tests / examples for Open MPI's Java MPI bindings ☆13 Updated 7 years ago
- Spawn JupyterHub single user notebook servers in Hadoop/YARN containers. ☆19 Updated 9 months ago
- MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion. ☆14 Updated 3 years ago
- Provides GPU awareness to Spark, Contact: @kmadhugit and @kiszk ☆172 Updated 7 years ago
- New URL: https://github.com/biointec/halvade ☆19 Updated 8 years ago
- Miscellaneous functionality for manipulating Apache Spark RDDs. ☆22 Updated 7 years ago
- A Scala-based DSL and framework for writing and executing bioinformatics pipelines as directed acyclic graphs ☆69 Updated 3 years ago
- A composable framework for fast and scalable data analytics ☆57 Updated 3 years ago
- Spark Shuffle Optimization with RDMA+AEP ☆30 Updated 2 years ago
- Spark Terasort ☆121 Updated 2 years ago
- [ARCHIVED] Moved to github.com/NVIDIA/spark-xgboost-examples ☆72 Updated 5 years ago
- A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer ☆52 Updated 2 years ago
- Efficient, distributed downloads of large files from S3 to HDFS using Spark. ☆17 Updated 8 years ago
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients ☆37 Updated 4 years ago
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange ☆130 Updated last year
- Apache Spark Data Source for ROOT File Format ☆29 Updated 6 years ago
- N-dimensional arrays, with Zarr and HDF5 integrations ☆19 Updated 6 years ago
- Fast I/O plugins for Spark ☆41 Updated 5 years ago
- Java JNI interface to the TileDB Arrays storage and query engine ☆26 Updated 2 weeks ago