lopentusska / slurm_ubuntu_gpu_cluster
Instructions for setting up a Slurm GPU cluster on Ubuntu 22.04.
☆28 Updated last year
Alternatives and similar repositories for slurm_ubuntu_gpu_cluster
Users who are interested in slurm_ubuntu_gpu_cluster compare it to the repositories listed below.
- Instructions for setting up a SLURM cluster using Ubuntu 18.04.3 with GPUs. ☆152 Updated 4 years ago
- Container plugin for Slurm Workload Manager ☆385 Updated last week
- NVIDIA NCCL Tests for Distributed Training ☆112 Updated last week
- A Slurm cluster using docker-compose ☆402 Updated 3 weeks ago
- Jobstats is a job monitoring platform for CPU and GPU clusters ☆97 Updated this week
- My tools for the Slurm HPC workload manager ☆542 Updated 2 weeks ago
- Slurm in Docker - Exploring Slurm using CentOS 7 based Docker images ☆129 Updated 5 years ago
- Open source web interface for Slurm HPC & AI clusters ☆494 Updated 3 weeks ago
- A dummy's guide to setting up (and using) HPC clusters on Ubuntu 22.04LTS using Slurm and Munge. Created by the Quant Club @ UIowa. ☆357 Updated last year
- Slurm-Mail is a drop in replacement for Slurm's e-mails to give users much more information about their jobs compared to the standard Slu… ☆112 Updated this week
- Ansible role for installing and managing the Slurm Workload Manager ☆108 Updated 6 months ago
- Profiling with NVIDIA Nsight Tools Bootcamp ☆14 Updated 2 years ago
- Super Computing On Web ☆303 Updated this week
- ☆56 Updated 10 months ago
- Prometheus exporter for performance metrics from Slurm. ☆265 Updated last year
- A tool for bandwidth measurements on NVIDIA GPUs. ☆541 Updated 5 months ago
- ☆315 Updated last year
- NCCL Tests ☆1,284 Updated last week
- Determined AI public environments ☆50 Updated last year
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆132 Updated 2 weeks ago
- Provide Python access to the NVML library for GPU diagnostics ☆248 Updated last month
- GPU Stress Test is a tool to stress the compute engine of NVIDIA Tesla GPU’s by running a BLAS matrix multiply using different data types… ☆110 Updated 3 months ago
- Benchmark Suite for Deep Learning ☆275 Updated 7 months ago
- ☆23 Updated this week
- Examples demonstrating available options to program multiple GPUs in a single node or a cluster ☆810 Updated 2 weeks ago
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training ☆13 Updated 9 months ago
- oneCCL Bindings for Pytorch* ☆102 Updated 2 months ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ☆198 Updated this week
- ☆45 Updated last week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆63 Updated 3 months ago