AMDResearch/intelliperf

Automated bottleneck detection and solution orchestration

☆23

Alternatives and similar repositories for intelliperf

Users who are interested in intelliperf are comparing it to the libraries listed below.

AMDResearch / intellikit
IntelliKit is a collection of intelligent tools designed to make GPU kernel development, profiling, and validation accessible to LLMs and…
☆27 · Updated this week
ROCm / iris
AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming
☆193 · Updated this week
ROCm / omnistat
Scale-out system monitoring
☆25 · Updated this week
AMD-AGI / Magpie
A lightweight, general-purpose framework for evaluating GPU kernel and benchmark.
☆56 · Updated this week
ypapadop-amd / ggml
Tensor library for machine learning
☆34 · Updated this week
ROCm / tritonBLAS
A lightweight triton-based General Matrix Multiplication (GEMM) library.
☆65 · Updated this week
ROCm / hrx-system
HRX: Hip Runtime Extended
☆18 · Updated this week
AMD-AGI / TraceLens
Automating analysis from trace files
☆81 · Updated this week
SakanaAI / robust-kbench
☆101 · Nov 22, 2025 · Updated 8 months ago
AMD-AGI / GEAK
Generating Efficient AI-Centric Kernels
☆123 · Updated this week
ROCm / rocprof-compute-viewer
☆62 · Jul 16, 2026 · Updated last week
ROCm / rocmProfileData
☆30 · Jun 16, 2026 · Updated last month
ShujianQian / epic-eval
☆10 · May 15, 2024 · Updated 2 years ago
ROCm / hipCollections
Header-only library of GPU-accelerated, concurrent data structures.
☆12 · Jul 10, 2026 · Updated last week
csc-training / hip
☆14 · Oct 5, 2022 · Updated 3 years ago
IBM / triton-dejavu
Framework to reduce autotune overhead to zero for well known deployments.
☆101 · Sep 19, 2025 · Updated 10 months ago
ROCm / rocBLAS-Examples
Examples illustrating usage of the rocBLAS library
☆17 · Aug 12, 2024 · Updated last year
flagos-ai / DeepSeek-V4-FlagOS
☆16 · Updated this week
danielvegamyhre / gemm
☆19 · Mar 29, 2026 · Updated 3 months ago
wafer-ai / kernel-arena
Public benchmark results from Kernel Arena, a leaderboard for LLM-generated AI accelerator kernels.
☆20 · Mar 11, 2026 · Updated 4 months ago
alibaba / redfuser
☆21 · Mar 17, 2026 · Updated 4 months ago
olcf / hip-training-series
Repository with examples and exercises for OLCF and AMD's HIP training series
☆17 · Oct 16, 2023 · Updated 2 years ago
AMD-AGI / Apex
Agents, and RL environment, for optimizing GPU kernels on AMD ROCm using LLM agents. Benchmarks LLM serving workloads end-to-end, profile…
☆71 · Jul 16, 2026 · Updated last week
ROCm / rocprofiler-sdk
[DEPRECATED] Moved to ROCm/rocm-systems repo
☆30 · May 28, 2026 · Updated last month
IaroslavElistratov / triton-autodiff
☆19 · Nov 11, 2025 · Updated 8 months ago
ROCm / rocSHMEM
[DEPRECATED] Moved to ROCm/rocm-systems repo
☆146 · Updated this week
ROCm / aiter
AI Tensor Engine for ROCm
☆497 · Updated this week
ademeure / QuickRunCUDA
☆20 · May 30, 2026 · Updated last month
ChandlerGuan / kperfir_artifact
☆19 · May 9, 2025 · Updated last year
seb-v / amd_challenge_solutions
☆19 · Jun 6, 2025 · Updated last year
hao-ai-lab / flash-attention-fp4
NVFP4 Flash-Attention 4 on BlackWell
☆30 · Updated this week
NVIDIA / compute-eval
Evaluating Large Language Models for CUDA Code Generation. ComputeEval is a framework designed to generate and evaluate CUDA code from Lar…
☆143 · May 19, 2026 · Updated 2 months ago
BishoySedra / HPC_Labs
Parallel Programming with MPI
☆25 · May 11, 2026 · Updated 2 months ago
botong-zhou / SimpleBS
A Simple Network Ping and Traceroute Tool
☆31 · Sep 24, 2015 · Updated 10 years ago
Deep-Learning-Profiling-Tools / triton-viz
☆350 · Updated this week
Samsung / veles.simd
Distributed machine learning platform
☆13 · Aug 20, 2015 · Updated 10 years ago
CRobeck / instrument-amdgpu-kernels
LLVM/MLIR based compiler instrumentation of AMD GPU kernels
☆21 · Jul 13, 2025 · Updated last year
YJMSTR / flash-linear-attention
FLA but cuTile
☆27 · Apr 17, 2026 · Updated 3 months ago
neuronsimulator / ringtest
Ring network model test to demonstrate the use of CoreNEURON
☆11 · Jul 5, 2026 · Updated 2 weeks ago