ROCm/iris

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/ROCm/iris)

ROCm / iris

AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming

☆193

Alternatives and similar repositories for iris

Users that are interested in iris are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

ROCm / mori
View on GitHub
Modular RDMA Interface
☆151Updated this week
ROCm / tritonBLAS
View on GitHub
A lightweight triton-based General Matrix Multiplication (GEMM) library.
☆65Jun 13, 2026Updated last month
ROCm / FlyDSL
View on GitHub
FlyDSL is the Python front‑end of the project: Flexible LaYout DSL.
☆237Updated this week
ROCm / aiter
View on GitHub
AI Tensor Engine for ROCm
☆497Updated this week
ROCm / ATOM
View on GitHub
AiTer Optimized Model
☆141Updated this week
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
ByteDance-Seed / Triton-distributed
View on GitHub
Distributed Compiler based on Triton for Parallel Systems
☆1,494Updated this week
AMDResearch / intelliperf
View on GitHub
Automated bottleneck detection and solution orchestration
☆23Feb 24, 2026Updated 4 months ago
facebookexperimental / triton
View on GitHub
Github mirror of trition-lang/triton repo.
☆178Updated this week
ROCm / rocSHMEM
View on GitHub
[DEPRECATED] Moved to ROCm/rocm-systems repo
☆146Updated this week
yifuwang / symm-mem-recipes
View on GitHub
☆170Dec 27, 2024Updated last year
tile-ai / tilescale
View on GitHub
Tile-based language built for AI computation across all scales
☆173Jun 16, 2026Updated last month
HazyResearch / HipKittens
View on GitHub
Fast and Furious AMD Kernels
☆444Jul 10, 2026Updated last week
foundation-model-stack / vllm-triton-backend
View on GitHub
A Triton-only attention backend for vLLM
☆27Updated this week
carlushuang / gcnasm
View on GitHub
amdgpu example code in hip/asm
☆66Jul 9, 2026Updated last week
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
ROCm / TransformerEngine
View on GitHub
☆72Updated this week
infinigence / FlashOverlap
View on GitHub
A lightweight design for computation-communication overlap.
☆242Jan 20, 2026Updated 6 months ago
ROCm / aotriton
View on GitHub
Ahead of Time (AOT) Triton Math Library
☆100Jul 13, 2026Updated last week
iree-org / wave
View on GitHub
Wave: Python Domain-Specific Language for High Performance Machine Learning
☆58Jun 29, 2026Updated 3 weeks ago
cherichy / tilecute
View on GitHub
☆32Jul 2, 2025Updated last year
perplexityai / pplx-kernels
View on GitHub
Perplexity GPU Kernels
☆591Nov 7, 2025Updated 8 months ago
meta-pytorch / torchcomms
View on GitHub
torchcomms: a modern PyTorch communications API
☆377Updated this week
uccl-project / mKernel
View on GitHub
mKernel: fast multi-node, multi-GPU fused kernels
☆251Jun 21, 2026Updated 3 weeks ago
flashinfer-ai / cutlass-viz
View on GitHub
☆65Apr 26, 2025Updated last year
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
AMD-AGI / Primus
View on GitHub
A flexible and high-performance training framework designed for large-scale foundation model training on AMD GPUs
☆107Updated this week
pytorch / helion
View on GitHub
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
☆910Updated this week
ROCm / composable_kernel
View on GitHub
[DEPRECATED] Moved to ROCm/rocm-libraries repo. NOTE: develop branch is maintained as a read-only mirror
☆538Updated this week
ROCm / amd_matrix_instruction_calculator
View on GitHub
A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators
☆139Apr 10, 2026Updated 3 months ago
KuangjuX / AttnLink
View on GitHub
An experimental communicating attention kernel based on DeepEP.
☆34Jul 29, 2025Updated 11 months ago
YJMSTR / flash-linear-attention
View on GitHub
FLA but cuTile
☆27Apr 17, 2026Updated 3 months ago
IBM / triton-dejavu
View on GitHub
Framework to reduce autotune overhead to zero for well known deployments.
☆101Sep 19, 2025Updated 10 months ago
NVIDIA / nvshmem
View on GitHub
NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process com…
☆560Updated this week
Dao-AILab / quack
View on GitHub
A Quirky Assortment of CuTe Kernels
☆1,063Updated this week
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
mirage-project / mirage
View on GitHub
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
☆2,376Updated this week
Dao-AILab / sonic-moe
View on GitHub
Accelerating MoE with IO and Tile-aware Optimizations
☆732Jul 4, 2026Updated 2 weeks ago
dropbox / gemlite
View on GitHub
Fast low-bit matmul kernels in Triton
☆477Updated this week
tile-ai / tilelang-benchmark
View on GitHub
☆22Jun 10, 2026Updated last month
apache / tvm-ffi
View on GitHub
Open ABI and FFI for Machine Learning Systems
☆434Updated this week
ROCm / DeepEP
View on GitHub
☆15Jun 30, 2026Updated 2 weeks ago
AMDResearch / intellikit
View on GitHub
IntelliKit is a collection of intelligent tools designed to make GPU kernel development, profiling, and validation accessible to LLMs and…
☆25Updated this week