AMD-AGI/maxtext-slurm

Toolkit for launching and observing MaxText training on Slurm-managed GPU clusters

☆28

Alternatives and similar repositories for maxtext-slurm

Users that are interested in maxtext-slurm are comparing it to the libraries listed below.

AMD-AGI / Primus-SaFE
Primus-SaFE (Stability and Fault Endurance)
☆56 · Updated this week
AMD-AGI / Primus-Turbo
A high-performance acceleration library dedicated to large-scale model training on AMD GPUs
☆67 · Updated this week
OswaldHe / LevelST
[FPGA 2024] Source code and bitstream for LevelST: Stream-based Accelerator for Sparse Triangular Solver
☆15 · Jun 1, 2025 · Updated last year
ROCm / omnistat
Scale-out system monitoring
☆25 · Updated this week
AMD-AGI / TraceLens
Automating analysis from trace files
☆81 · Updated this week
google / CoMMA
☆24 · Jun 29, 2026 · Updated 3 weeks ago
OswaldHe / HMT-pytorch
[NAACL 2025] Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing"
☆80 · Mar 12, 2026 · Updated 4 months ago
oxia-db / oxia-client-java
Oxia Java client SDK
☆21 · Jul 14, 2026 · Updated last week
shaktsin / gpt2.c
GPT2 Inference Implementation in Pure C
☆32 · Jun 23, 2025 · Updated last year
pascaldevink / awesome-pulsar
A curated list of Apache Pulsar resources
☆13 · Oct 30, 2018 · Updated 7 years ago
gernest / hoodie
pure zig language server with swagger and bling bling
☆20 · Sep 6, 2019 · Updated 6 years ago
vectorch-ai / ScaleLLM
A high-performance inference system for large language models, designed for production environments.
☆499 · Dec 19, 2025 · Updated 7 months ago
chris-albert / zio4j
Java wrapper for the ZIO scala library
☆11 · Mar 31, 2021 · Updated 5 years ago
Emad-H / Physical-Verification-using-SKY130
Repository for VSD-IAT Workshop: Physical Verification using SKY130
☆10 · Aug 15, 2021 · Updated 4 years ago
pgaref / ACaZoo
A distributed key-value store based on replicated LSM-Trees
☆10 · May 18, 2017 · Updated 9 years ago
archfish / pulsar_sdk
A pure ruby client for Apache Pulsar
☆13 · Oct 1, 2025 · Updated 9 months ago
vkulichenko / code-session
☆12 · Nov 17, 2020 · Updated 5 years ago
vrischmann / zig-cassandra
Cassandra CQL client
☆16 · Apr 13, 2026 · Updated 3 months ago
BaguaSys / bagua-net
High performance NCCL plugin for Bagua.
☆15 · Sep 15, 2021 · Updated 4 years ago
keithrozario / S3-71
Copy millions of objects in minutes
☆12 · Oct 21, 2019 · Updated 6 years ago
vrtnis / tycoon-learning-environment
A JAX transport-economy learning environment for route planning, cargo flow, financing, and replayable agent benchmarks.
☆16 · Jul 2, 2026 · Updated 2 weeks ago
EvoML / EvoML
Using Genetic Algorithms to aid Machine Learning
☆20 · Feb 20, 2018 · Updated 8 years ago
ranaumarnadeem / OpenTestability
OpenTestability is an open-source tool for structural analysis of digital circuits, enabling computation of SCOAP metrics, Controllibilit…
☆20 · Jul 13, 2026 · Updated last week
omarkamali / picomon
Beautiful TUI dashboard for monitoring GPUs (AMD, NVIDIA, Apple Silicon)
☆16 · May 17, 2026 · Updated 2 months ago
wpcarro / galapagos
Simple evolutionary solver in Rust
☆15 · Jan 2, 2025 · Updated last year
yjshen / spark-connector-test
A tutorial on how to use pulsar-spark-connector
☆11 · Oct 13, 2020 · Updated 5 years ago
mikelhernaez / qvz
A Lossy compressor for Quality Scores in Genomic Data
☆12 · Oct 4, 2016 · Updated 9 years ago
lograze / logfish
Lightweight log explorer interface for ClickHouse inspired by Kibana
☆11 · Nov 16, 2020 · Updated 5 years ago
memkind / jemalloc
☆15 · Apr 6, 2016 · Updated 10 years ago
NVIDIA / atex
A TensorFlow Extension: GPU performance tools for TensorFlow.
☆26 · Jul 27, 2023 · Updated 2 years ago
project-faros / cluster-manager
The meat and potatoes behind farosctl
☆12 · Feb 28, 2023 · Updated 3 years ago
palantir / assertj-automation
Automatic code rewriting for AssertJ using error-prone and refaster
☆25 · Updated this week
pijul-scm / sanakirja
☆17 · Mar 13, 2017 · Updated 9 years ago
matrixorigin / matrixorigin.io.cn
☆12 · Jul 6, 2026 · Updated 2 weeks ago
hfp / libxstream
Library and accelerator backend
☆15 · Updated this week
claydotio / synapse
A transparent service discovery framework for connecting an SOA
☆13 · Feb 15, 2015 · Updated 11 years ago
arquillian / arquillian-cube-q
Fault injection and chaos testing all in one, sweet DSL.
☆12 · Jan 26, 2026 · Updated 5 months ago
nevgeniev / zig-maven-plugin
zig language maven plugin
☆14 · May 5, 2026 · Updated 2 months ago
matrixorigin / matrixkv
This is a distributed kv project to demonstrate how to use matrixcube
☆17 · Sep 8, 2022 · Updated 3 years ago