GoogleCloudPlatform / slurm-gcp
☆49 Updated this week
Alternatives and similar repositories for slurm-gcp
Users that are interested in slurm-gcp are comparing it to the libraries listed below
- Cluster Toolkit is open-source software offered by Google Cloud that makes it easy for customers to deploy AI/ML and HPC environments… ☆264 Updated this week
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers orchestrate training jobs on accelerat… ☆123 Updated last week
- ☆34 Updated this week
- ☆36 Updated this week
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆46 Updated this week
- Tools to deploy GPU clusters in the Cloud ☆31 Updated 2 years ago
- Repository of machine learning benchmarks ☆36 Updated 2 weeks ago
- ☆141 Updated 2 weeks ago
- NVIDIA's launch, startup, and logging scripts used by our MLPerf Training and HPC submissions ☆27 Updated last month
- ☆14 Updated 4 years ago
- ☆18 Updated this week
- Parallel Computing starter project to build GPU & CPU kernels in CUDA & C++ and call them from Python without a single line of CMake usin… ☆26 Updated 3 months ago
- train with kittens! ☆59 Updated 8 months ago
- ☆25 Updated this week
- Dragon distributed runtime for HPC and AI applications and workflows ☆72 Updated last month
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud. ☆73 Updated this week
- ☆43 Updated 4 months ago
- Example ML projects that use the Determined library. ☆32 Updated 9 months ago
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆112 Updated this week
- A simplified and automated orchestration workflow to perform ML end-to-end (E2E) model tests and benchmarking on Cloud VMs across differe… ☆48 Updated this week
- Dev repo for power measurement for the MLPerf™ benchmarks ☆24 Updated 2 months ago
- ☆21 Updated this week
- Azure CycleCloud project to enable users to create, configure, and use Slurm HPC clusters. ☆66 Updated this week
- This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic … ☆94 Updated this week
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆349 Updated 2 weeks ago
- General policies for MLPerf™ including submission rules, coding standards, etc. ☆28 Updated this week
- Slurm on Google Cloud Platform ☆187 Updated 9 months ago
- Testing framework for Deep Learning models (Tensorflow and PyTorch) on Google Cloud hardware accelerators (TPU and GPU) ☆64 Updated last week
- Notes and artifacts from the ONNX steering committee ☆26 Updated 2 weeks ago
- Deploy your HPC cluster on AWS in 20 minutes with just one click. ☆64 Updated last year