LLNL / spark-hdf5

A plugin to enable Apache Spark to read HDF5 files

☆20

Alternatives and similar repositories for spark-hdf5

Users who are interested in spark-hdf5 are comparing it to the libraries listed below.

valiantljk / h5spark
Supporting Hierarchical Data Format and Rich Parallel I/O Interface in Spark
☆42 Updated 4 years ago
rokroskar / sparkhpc
Launching and controlling Spark on HPC clusters
☆23 Updated 3 years ago
SciDriver / spark-mpi
MPI-oriented extension of the Spark computational model
☆24 Updated 7 years ago
WhiteFangBuck / CDSW-DL
Set up tools for running a few DL libraries on CDH and CDSW
☆17 Updated 5 years ago
llnl / magpie
Magpie contains a number of scripts for running Big Data software in HPC environments, including Hadoop and Spark. There is support for L…
☆196 Updated 11 months ago
jupyterhub / jupyterhub-example-kerberos
Example for experimenting with how JupyterHub can be configured to work with Kerberos
☆33 Updated 8 years ago
SciSpark / SciSpark
Scientific Spark - a NASA AIST14 project
☆86 Updated 7 years ago
kiszk / spark-gpu
Spark GPU and SIMD Support
☆61 Updated 5 years ago
animeshtrivedi / ArrowExample
Java read and write example for Apache Arrow
☆34 Updated 8 years ago
marcbux / Hi-WAY
Heterogeneity-incorporating Workflow ApplicationMaster for YARN
☆26 Updated 8 years ago
CODAIT / stocator
Stocator is a high-performing connector to object storage for Apache Spark, achieving performance by leveraging object storage semantics.
☆114 Updated last year
open-mpi / ompi-java-test
Some tests / examples for Open MPI's Java MPI bindings
☆13 Updated 7 years ago
jupyterhub / yarnspawner
Spawn JupyterHub single user notebook servers in Hadoop/YARN containers.
☆19 Updated 9 months ago
mcapuccini / MaRe
MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.
☆14 Updated 3 years ago
IBMSparkGPU / GPUEnabler
Provides GPU awareness to Spark, Contact: @kmadhugit and @kiszk
☆172 Updated 7 years ago
ddcap / halvade
New url: https://github.com/biointec/halvade
☆19 Updated 8 years ago
hammerlab / magic-rdds
Miscellaneous functionality for manipulating Apache Spark RDDs.
☆22 Updated 7 years ago
fulcrumgenomics / dagr
A scala based DSL and framework for writing and executing bioinformatics pipelines as Directed Acyclic GRaphs
☆69 Updated 3 years ago
cylondata / twister2
A composable framework for fast and scalable data analytics
☆57 Updated 3 years ago
Intel-bigdata / Spark-PMoF
Spark Shuffle Optimization with RDMA+AEP
☆30 Updated 2 years ago
ehiggs / spark-terasort
Spark Terasort
☆121 Updated 2 years ago
rapidsai / spark-examples
[ARCHIVED] Moved to github.com/NVIDIA/spark-xgboost-examples
☆72 Updated 5 years ago
openucx / sparkucx
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer
☆52 Updated 2 years ago
BD2KGenomics / conductor
Efficient, distributed downloads of large files from S3 to HDFS using Spark.
☆17 Updated 8 years ago
BryanCutler / SparkArrowFlight
Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients
☆37 Updated 4 years ago
MemVerge / splash
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
☆130 Updated last year
diana-hep / spark-root
Apache Spark Data Source for ROOT File Format
☆29 Updated 6 years ago
lasersonlab / ndarray.scala
N-dimensional arrays, with Zarr and HDF5 integrations
☆19 Updated 6 years ago
zrlio / crail-spark-io
Fast I/O plugins for Spark
☆41 Updated 5 years ago
TileDB-Inc / TileDB-Java
Java JNI interface to the TileDB Arrays storage and query engine
☆26 Updated 2 weeks ago