GoogleCloudPlatform / slurm-gcp
☆40 Updated last week
Alternatives and similar repositories for slurm-gcp:
Users who are interested in slurm-gcp are comparing it to the repositories listed below.
- Cluster Toolkit is open-source software offered by Google Cloud which makes it easy for customers to deploy AI/ML and HPC environments… ☆250 Updated this week
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool that helps Cloud developers orchestrate training jobs on accelerat… ☆117 Updated last week
- Collection of scripts to build PyTorch and the domain libraries from source. ☆10 Updated this week
- Deploy your HPC cluster on AWS in 20 minutes with just one click. ☆64 Updated last year
- NVIDIA's launch, startup, and logging scripts used by our MLPerf Training and HPC submissions ☆26 Updated this week
- ☆43 Updated 3 months ago
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud. ☆60 Updated 2 weeks ago
- Tools to deploy GPU clusters in the Cloud ☆31 Updated 2 years ago
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers. ☆110 Updated last week
- Optimized primitives for collective multi-GPU communication ☆9 Updated last year
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆110 Updated 3 weeks ago
- Testing if I can implement slurm in an operator ☆14 Updated 6 months ago
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆59 Updated 3 weeks ago
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆94 Updated this week
- Carbon Limiting Auto Tuning for Kubernetes ☆37 Updated 5 months ago
- ☆21 Updated 2 months ago
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆100 Updated this week
- Slurm on Google Cloud Platform ☆184 Updated 7 months ago
- A Slurm dashboard for the terminal. ☆85 Updated last year
- OCI-compatible engine to deploy Linux containers on HPC environments. ☆136 Updated 6 months ago
- Cloud Native Benchmarking of Foundation Models ☆32 Updated 6 months ago
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆44 Updated last week
- Notes and artifacts from the ONNX steering committee ☆26 Updated last week
- ☆17 Updated last week
- Testing framework for Deep Learning models (TensorFlow and PyTorch) on Google Cloud hardware accelerators (TPU and GPU) ☆64 Updated 5 months ago
- Parallel Computing starter project to build GPU & CPU kernels in CUDA & C++ and call them from Python without a single line of CMake usin… ☆25 Updated last month
- Dragon distributed runtime for HPC and AI applications and workflows ☆69 Updated this week
- GPU Environment Management for Visual Studio Code ☆38 Updated last year
- Repository of machine learning benchmarks ☆36 Updated this week
- Container plugin for Slurm Workload Manager ☆340 Updated 6 months ago