Slurm: A Highly Scalable Workload Manager
☆3,795 · Mar 14, 2026 · Updated this week
Alternatives and similar repositories for slurm
Users interested in Slurm are comparing it to the repositories listed below.
- MUNGE (MUNGE Uid 'N' Gid Emporium) is an authentication service for creating and validating user credentials. ☆300 · Feb 28, 2026 · Updated 2 weeks ago
- Open source web interface for Slurm HPC & AI clusters ☆547 · Mar 2, 2026 · Updated 2 weeks ago
- Python Interface to Slurm ☆558 · Updated this week
- An HPC workload manager and job scheduler for desktops, clusters, and clouds. ☆787 · Feb 11, 2026 · Updated last month
- My tools for the Slurm HPC workload manager ☆570 · Updated this week
- Open MPI main development repository ☆2,543 · Mar 10, 2026 · Updated last week
- Lmod: An Environment Module System based on Lua, Reads TCL Modules, Supports a Software Hierarchy ☆581 · Updated this week
- Container plugin for Slurm Workload Manager ☆419 · Feb 18, 2026 · Updated last month
- OpenPMIx Project Repository ☆257 · Mar 11, 2026 · Updated last week
- Singularity has been renamed to Apptainer as part of us moving the project to the Linux Foundation. This repo has been persisted as a sna… ☆2,608 · Oct 10, 2022 · Updated 3 years ago
- LBNL Node Health Check ☆275 · Apr 18, 2025 · Updated 11 months ago
- A Slurm cluster using docker-compose ☆474 · Updated this week
- Apptainer: Application containers for Linux ☆1,765 · Updated this week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ☆41,773 · Updated this week
- Slurm on Google Cloud Platform ☆190 · Sep 18, 2024 · Updated last year
- Prometheus exporter for performance metrics from Slurm. ☆275 · Jun 20, 2024 · Updated last year
- A flexible package manager that supports multiple versions, configurations, platforms, and compilers. ☆4,970 · Mar 11, 2026 · Updated last week
- Optimized primitives for collective multi-GPU communication ☆4,513 · Mar 8, 2026 · Updated last week
- A simple yet powerful tool to turn traditional container/OS images into unprivileged sandboxes. ☆913 · Feb 18, 2026 · Updated last month
- OpenHPC Integration, Packaging, and Test Repo ☆975 · Updated this week
- Environment Modules: provides dynamic modification of a user's environment ☆835 · Feb 17, 2026 · Updated last month
- Tools for building GPU clusters ☆1,424 · Feb 23, 2026 · Updated 3 weeks ago
- A Cloud Native Batch System (Project under CNCF) ☆5,381 · Mar 11, 2026 · Updated last week
- Steps to create a small Slurm cluster with GPU-enabled nodes ☆271 · Feb 2, 2023 · Updated 3 years ago
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. ☆14,679 · Dec 1, 2025 · Updated 3 months ago
- NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs ☆681 · Feb 17, 2026 · Updated last month
- SingularityCE is the Community Edition of Singularity, an open source container platform designed to be simple, fast, and secure. ☆939 · Mar 11, 2026 · Updated last week
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ☆35,108 · Updated this week
- Supercomputing. Seamlessly. Open, Interactive HPC Via the Web ☆438 · Updated this week
- NVIDIA device plugin for Kubernetes ☆3,699 · Updated this week
- Official MPICH Repository ☆665 · Mar 10, 2026 · Updated last week
- Development repository for the Triton language and compiler ☆18,656 · Updated this week
- Singularity implementation of k8s operator for interacting with SLURM. ☆117 · Dec 29, 2020 · Updated 5 years ago
- Ansible role for installing and managing the Slurm Workload Manager ☆115 · Nov 24, 2025 · Updated 3 months ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆41,807 · Updated this week
- The Triton Inference Server provides an optimized cloud and edge inferencing solution. ☆10,426 · Updated this week
- Machine Learning Toolkit for Kubernetes ☆15,519 · Jan 5, 2026 · Updated 2 months ago
- Parallel computing with task scheduling ☆13,765 · Updated this week
- Shifter - Linux Containers for HPC ☆361 · Oct 25, 2025 · Updated 4 months ago