glennklockwood / myhadoop
Framework for deploying Hadoop clusters on traditional HPC from userland
☆45 Updated 7 years ago
Alternatives and similar repositories for myhadoop
Users who are interested in myhadoop are comparing it to the repositories listed below.
- Magpie contains a number of scripts for running Big Data software in HPC environments, including Hadoop and Spark. There is support for L… ☆199 Updated 5 months ago
- Default Repo description from terraform module ☆5 Updated 5 years ago
- Reference service implementation of the HDF5 REST API ☆170 Updated 2 years ago
- High-level python wrapper for the Sun Grid Engine (SGE) using DRMAA and ZMQ ☆21 Updated 11 years ago
- ☆60 Updated 3 years ago
- Documented examples of Jupyterhub deployment in HPC settings ☆36 Updated last year
- Python-based viewer for HDF5 and other HDF5-like file formats ☆131 Updated 6 months ago
- Create clusters of VMs on the cloud and configure them with Ansible. ☆337 Updated last year
- Custom Spawner for Jupyterhub to start servers in batch scheduled systems ☆199 Updated last month
- launching and controlling spark on hpc clusters ☆23 Updated 3 years ago
- Scientific Spark - a NASA AIST14 project ☆85 Updated 7 years ago
- Deploy Dask on DRMAA clusters ☆40 Updated 4 years ago
- Supporting Hierarchical Data Format and Rich Parallel I/O Interface in Spark ☆42 Updated 4 years ago
- Remote Spawner class for JupyterHub to spawn IPython notebooks and a remote server and tunnel the port via SSH ☆26 Updated 9 years ago
- hanythingondemand provides a set of scripts to easily set up an ad-hoc Hadoop cluster through PBS jobs ☆12 Updated 6 years ago
- HDF5 Tutorial ☆105 Updated 11 years ago
- h5py distributed - Python client library for HDF Rest API ☆121 Updated last week
- Specification and tools for representing HDF5 in JSON ☆80 Updated last week
- ☆100 Updated 10 months ago
- Custom Spawner for Jupyterhub to start slurm jobs when users log in ☆24 Updated 3 years ago
- VisTrails is an open-source data analysis and visualization tool. It provides a comprehensive provenance infrastructure that maintains de… ☆104 Updated 7 years ago
- StarCluster is an open source cluster-computing toolkit for Amazon's Elastic Compute Cloud (EC2). ☆581 Updated 3 years ago
- XALT: System tracking of users codes on clusters ☆45 Updated last month
- files and instructions for creating and using example containers from the sylabs.io blog ☆105 Updated 2 years ago
- Kira is an astronomy image processing toolkit implemented with Apache Spark. ☆15 Updated 9 years ago
- Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX ☆220 Updated 4 years ago
- Task scheduling and blocked algorithms for parallel processing ☆17 Updated 3 weeks ago
- Demo notebooks inside a docker for end-to-end examples ☆113 Updated 7 years ago
- Scalable dynamic library and python loading in HPC environments ☆102 Updated last week
- Jupyter plugin that provides a tab for TACC Lmod (https://github.com/TACC/Lmod) ☆32 Updated 2 weeks ago