☆63 · Mar 26, 2026 · Updated this week
Alternatives and similar repositories for slurm-gcp
Users who are interested in slurm-gcp are comparing it to the libraries listed below.
- Cluster Toolkit is an open-source software offered by Google Cloud which makes it easy for customers to deploy AI/ML and HPC environments… ☆327 · Updated this week
- ☆13 · Jun 18, 2024 · Updated last year
- Slurm on Google Cloud Platform ☆190 · Sep 18, 2024 · Updated last year
- Slurm Exporter for Prometheus ☆18 · Aug 6, 2024 · Updated last year
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud. ☆123 · Updated this week
- A Top-Down Profiler for GPU Applications ☆22 · Feb 29, 2024 · Updated 2 years ago
- UbiOps Tutorials ☆15 · Updated this week
- SLURM Tools and UBiLities ☆74 · Aug 1, 2022 · Updated 3 years ago
- Automatically exported from code.google.com/p/hpc-workspace ☆37 · Jan 19, 2026 · Updated 2 months ago
- A testing framework and a set of test suites used for testing GCE Images. ☆16 · Mar 21, 2026 · Updated last week
- Standard interface for collecting HPC run metadata ☆16 · Nov 7, 2025 · Updated 4 months ago
- Estimate MFU for DeepSeekV3 ☆26 · Jan 5, 2025 · Updated last year
- Source-to-Source Debuggable Derivatives in Pure Python ☆15 · Jan 23, 2024 · Updated 2 years ago
- Container plugin for Slurm Workload Manager ☆423 · Updated this week
- ☆150 · Updated this week
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆20 · Oct 23, 2023 · Updated 2 years ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling ☆22 · Mar 18, 2026 · Updated last week
- ☆37 · Aug 12, 2025 · Updated 7 months ago
- This repository contains the results and code for the MLPerf™ Training v2.1 benchmark. ☆15 · Aug 9, 2023 · Updated 2 years ago
- Tool to prep huge data volumes for place in archives like Glacier, Data Den, or HPSS ☆30 · Dec 23, 2025 · Updated 3 months ago
- A multi-platform experimentation framework written in python. ☆68 · Mar 18, 2026 · Updated last week
- Dynamic execution environments for coupled, thread-heterogeneous MPI+X applications ☆21 · Mar 3, 2025 · Updated last year
- Simple python library for generating your own perfetto traces for your application. Can be used for both app instrumentation and custom … ☆25 · Jun 22, 2025 · Updated 9 months ago
- A simple yet powerful tool to turn traditional container/OS images into unprivileged sandboxes. ☆919 · Updated this week
- A TUI-based utility for real-time monitoring of InfiniBand traffic and performance metrics on the local node ☆64 · Dec 19, 2025 · Updated 3 months ago
- A flat container abstraction for Rust ☆16 · Nov 24, 2025 · Updated 4 months ago
- A JAX-native High Performance Eval Metrics Library ☆58 · Feb 3, 2026 · Updated last month
- HDF5 Cache VOL connector for caching data on fast storage layers and moving data asynchronously to the parallel file system to hide I/O o… ☆21 · Feb 10, 2026 · Updated last month
- Help protect against malicious build scripts ☆27 · Mar 21, 2026 · Updated last week
- C++ Header-Only Library for High-Performance Tensor-Vector Multiplication ☆23 · Nov 2, 2025 · Updated 4 months ago
- Simple HPC queuing system adapter for Python based on jinja templates to automate the submission script creation. ☆34 · Updated this week
- ☆23 · Aug 14, 2024 · Updated last year
- AWS ParallelCluster is an AWS supported Open Source cluster management tool to deploy and manage HPC clusters in the AWS cloud. ☆884 · Updated this week
- idevicerestore, but with support for virtual devices ☆11 · Jan 27, 2024 · Updated 2 years ago
- This is a repository with examples to run inference endpoints on various ALCF clusters ☆27 · Feb 3, 2026 · Updated last month
- OpenMP vs Offload ☆23 · Jun 2, 2023 · Updated 2 years ago
- Ansible role for installing and managing the Slurm Workload Manager ☆116 · Nov 24, 2025 · Updated 4 months ago
- A platform for managing machine learning experiments ☆902 · Mar 3, 2026 · Updated 3 weeks ago
- ☆49 · Jan 5, 2026 · Updated 2 months ago