ULHPC / puppet-slurm
A Puppet module designed to configure and manage SLURM (see https://slurm.schedmd.com/), an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
☆19 Updated 2 weeks ago
Alternatives and similar repositories for puppet-slurm:
Users interested in puppet-slurm compare it to the repositories listed below:
- A Slurm-based HPC workload management environment, driven by Ansible. ☆52 Updated this week
- Prometheus exporter for slurm job/node data ☆33 Updated 5 months ago
- Dynamic Registry Proxy ☆15 Updated last year
- Ansible role for OpenHPC ☆47 Updated last month
- Bare Metal Provisioning system for HPC Linux clusters ☆58 Updated this week
- Prometheus exporter for the stats in the cgroup accounting with slurm. This will also collect stats of a job using NVIDIA GPUs. ☆27 Updated 5 months ago
- A daemon that uses cgroups to monitor and manage user behavior on login nodes ☆62 Updated 5 months ago
- A coherent Ansible roles collection to simply deploy clusters of nodes. ☆120 Updated 2 weeks ago
- Fluxion Graph-based Scheduler ☆91 Updated last week
- Puppet module for SLURM client and server ☆15 Updated 2 years ago
- Command-line tool to retrieve information and monitor Mellanox unmanaged InfiniBand switches ☆53 Updated last month
- Slurm Lua SPANK plugin ☆16 Updated 2 years ago
- SLURM Bank, a collection of wrapper scripts that give Slurm GOLD-like capabilities for managing resources. ☆24 Updated 6 years ago
- A quick and dirty REST interface to the Slurm API and commands. ☆11 Updated 7 years ago
- InfiniBand fabric monitoring daemon written in Go ☆30 Updated 10 months ago
- Ansible playbook for OpenHPC ☆24 Updated 5 years ago
- HPC dashboards developed for SRCC systems ☆18 Updated 3 years ago
- HPC tests using MPI codes & synthetic benchmarks with IB/RoCE comparisons - from StackHPC Ltd. ☆19 Updated 2 years ago
- ☆13 Updated 3 years ago
- Export select slurm metrics to prometheus ☆47 Updated this week
- Monitoring and visualization of InfiniBand Fabrics ☆20 Updated 3 years ago
- Warewulf is a scalable systems management suite originally developed to manage large high-performance Linux clusters. ☆107 Updated 9 months ago
- Create beegfs server and client ☆24 Updated 3 years ago
- Slurm SPANK plugin to let users change GPU compute mode in jobs ☆12 Updated last year
- ☆27 Updated 8 months ago
- Kraken is a distributed state engine framework for scalable automation and orchestration tools. ☆55 Updated last year
- server for storage and management of singularity images ☆104 Updated 6 months ago
- Prometheus exporter for use with the Lustre parallel filesystem ☆21 Updated 2 months ago
- LBNL Node Health Check ☆241 Updated 2 weeks ago
- A tool to generate Slurm topology configuration from InfiniBand network discovery. ☆21 Updated 8 years ago