☆64 Apr 14, 2026 Updated this week
Alternatives and similar repositories for slurm-gcp
Users that are interested in slurm-gcp are comparing it to the libraries listed below.
- Cluster Toolkit is an open-source software offered by Google Cloud which makes it easy for customers to deploy AI/ML and HPC environments… ☆333 Apr 10, 2026 Updated last week
- Slurm on Google Cloud Platform ☆190 Sep 18, 2024 Updated last year
- Slurm Exporter for Prometheus ☆18 Aug 6, 2024 Updated last year
- ☆17 Apr 3, 2026 Updated 2 weeks ago
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud. ☆130 Updated this week
- A Top-Down Profiler for GPU Applications ☆22 Feb 29, 2024 Updated 2 years ago
- UbiOps Tutorials ☆15 Mar 25, 2026 Updated 3 weeks ago
- SLURM Tools and UBiLities ☆74 Aug 1, 2022 Updated 3 years ago
- Automatically exported from code.google.com/p/hpc-workspace ☆37 Jan 19, 2026 Updated 3 months ago
- A testing framework and a set of test suites used for testing GCE Images. ☆16 Apr 8, 2026 Updated last week
- Standard interface for collecting HPC run metadata ☆16 Nov 7, 2025 Updated 5 months ago
- Estimate MFU for DeepSeekV3 ☆26 Jan 5, 2025 Updated last year
- Source-to-Source Debuggable Derivatives in Pure Python ☆15 Jan 23, 2024 Updated 2 years ago
- Container plugin for Slurm Workload Manager ☆428 Updated this week
- ☆151 Apr 10, 2026 Updated last week
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆20 Oct 23, 2023 Updated 2 years ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling ☆22 Apr 9, 2026 Updated last week
- ☆37 Aug 12, 2025 Updated 8 months ago
- This repository contains the results and code for the MLPerf™ Training v2.1 benchmark. ☆15 Aug 9, 2023 Updated 2 years ago
- A multi-platform experimentation framework written in python. ☆68 Updated this week
- Tool to prep huge data volumes for place in archives like Glacier, Data Den, or HPSS ☆30 Updated this week
- Dynamic execution environments for coupled, thread-heterogeneous MPI+X applications ☆21 Mar 3, 2025 Updated last year
- ☆18 Jan 3, 2025 Updated last year
- Simple python library for generating your own perfetto traces for your application. Can be used for both app instrumentation and custom … ☆25 Jun 22, 2025 Updated 9 months ago
- A simple yet powerful tool to turn traditional container/OS images into unprivileged sandboxes. ☆927 Updated this week
- A TUI-based utility for real-time monitoring of InfiniBand traffic and performance metrics on the local node ☆64 Dec 19, 2025 Updated 4 months ago
- linear algebra package. like gonum/mat, but small. lets say gonum-lite ☆12 Jul 8, 2023 Updated 2 years ago
- Slurm in Kubernetes ☆43 Nov 20, 2025 Updated 4 months ago
- A JAX-native High Performance Eval Metrics Library ☆56 Updated this week
- HDF5 Cache VOL connector for caching data on fast storage layers and moving data asynchronously to the parallel file system to hide I/O o… ☆21 Feb 10, 2026 Updated 2 months ago
- ☆14 Jul 13, 2025 Updated 9 months ago
- Help protect against malicious build scripts ☆28 Updated this week
- Performance portable equations of state and mixed cell closures ☆34 Apr 12, 2026 Updated last week
- Simple HPC queuing system adapter for Python based on jinja templates to automate the submission script creation. ☆34 Updated this week
- ☆23 Aug 14, 2024 Updated last year
- AWS ParallelCluster is an AWS supported Open Source cluster management tool to deploy and manage HPC clusters in the AWS cloud. ☆887 Updated this week
- idevicerestore, but with support for virtual devices ☆12 Jan 27, 2024 Updated 2 years ago
- OpenMP vs Offload ☆23 Jun 2, 2023 Updated 2 years ago
- A platform for managing machine learning experiments ☆905 Mar 31, 2026 Updated 2 weeks ago