ROCm/aiter

AI Tensor Engine for ROCm

☆497

Alternatives and similar repositories for aiter

Users who are interested in aiter are comparing it to the libraries listed below.

ROCm / ATOM
AiTer Optimized Model
☆141 · Updated this week
ROCm / FlyDSL
FlyDSL is the Python front‑end of the project: Flexible LaYout DSL.
☆237 · Updated this week
ROCm / mori
Modular RDMA Interface
☆151 · Updated this week
ROCm / composable_kernel
[DEPRECATED] Moved to ROCm/rocm-libraries repo. NOTE: develop branch is maintained as a read-only mirror
☆538 · Updated this week
carlushuang / gcnasm
amdgpu example code in hip/asm
☆66 · Jul 9, 2026 · Updated last week
HazyResearch / HipKittens
Fast and Furious AMD Kernels
☆444 · Jul 10, 2026 · Updated last week
ROCm / iris
AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming
☆193 · Updated this week
ROCm / rocmProfileData
☆30 · Jun 16, 2026 · Updated last month
ROCm / rocm-libraries
super repo for rocm libraries
☆389 · Updated this week
ROCm / TransformerEngine
☆72 · Updated this week
AMD-AGI / Primus
A flexible and high-performance training framework designed for large-scale foundation model training on AMD GPUs
☆107 · Updated this week
ROCm / rocm-systems
super repo for rocm systems projects
☆438 · Updated this week
ROCm / Megatron-LM
Ongoing research training transformer models at scale
☆43 · Updated this week
ROCm / flash-attention
Fast and memory-efficient exact attention
☆234 · Updated this week
ROCm / hipBLASLt
[DEPRECATED] Moved to ROCm/rocm-libraries repo
☆114 · Updated this week
ROCm / rocprof-compute-viewer
☆61 · Updated this week
ROCm / amd_matrix_instruction_calculator
A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators
☆139 · Apr 10, 2026 · Updated 3 months ago
ROCm / triton
Development repository for the Triton language and compiler
☆146 · Updated this week
AMD-AGI / GEAK
Generating Efficient AI-Centric Kernels
☆121 · Updated this week
ROCm / MAD
MAD (Model Automation and Dashboarding)
☆38 · Jul 7, 2026 · Updated 2 weeks ago
ROCm / aotriton
Ahead of Time (AOT) Triton Math Library
☆100 · Jul 13, 2026 · Updated last week
ROCm / rocSHMEM
[DEPRECATED] Moved to ROCm/rocm-systems repo
☆146 · Updated this week
mk1-project / quickreduce
QuickReduce is a performant all-reduce library designed for AMD ROCm that supports inline compression.
☆38 · Aug 29, 2025 · Updated 10 months ago
AMD-AGI / Primus-Turbo
A high-performance acceleration library dedicated to large-scale model training on AMD GPUs
☆67 · Updated this week
ROCm / rocprofiler-compute
[DEPRECATED] Moved to ROCm/rocm-systems repo
☆165 · May 28, 2026 · Updated last month
flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
☆5,988 · Updated this week
ROCm / TheRock
The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm
☆1,157 · Updated this week
ROCm / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆122 · Updated this week
ByteDance-Seed / Triton-distributed
Distributed Compiler based on Triton for Parallel Systems
☆1,494 · Updated this week
ROCm / rccl-tests
[DEPRECATED] Moved to ROCm/rocm-systems repo
☆92 · Jul 14, 2026 · Updated last week
ROCm / tritonBLAS
A lightweight triton-based General Matrix Multiplication (GEMM) library.
☆65 · Jun 13, 2026 · Updated last month
SemiAnalysisAI / InferenceX
Open Source Continuous Inference Benchmark Research Platform — Kimi K2.7-Code, MiniMax M3, DeepSeekv4, GLM5 - GB200 NVL72 vs MI355X vs B2…
☆1,264 · Updated this week
ROCm / rocWMMA
[DEPRECATED] Moved to ROCm/rocm-libraries repo
☆140 · Jul 13, 2026 · Updated last week
powderluv / vllm-docs
Documentation for vLLM Dev Channel releases
☆10 · Dec 5, 2024 · Updated last year
NVIDIA / cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,104 · Updated this week
tile-ai / tilelang
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
☆6,674 · Updated this week
ROCm / AMDMIGraphX
AMD's graph optimization engine.
☆318 · Updated this week
seb-v / fp32_sgemm_amd
Super fast FP32 matrix multiplication on RDNA3
☆92 · Mar 30, 2025 · Updated last year
mirage-project / mirage
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
☆2,376 · Updated this week