ekondis/gpumembench

A GPU benchmark suite for assessing on-chip GPU memory bandwidth

☆113

Alternatives and similar repositories for gpumembench

Users who are interested in gpumembench are comparing it to the libraries listed below.

ekondis / mixbench
A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP)
☆463 • Updated this week
ekondis / gpuroofperf-toolkit
A GPU performance prediction toolkit for CUDA programs
☆18 • Mar 25, 2019 • Updated 7 years ago
hma02 / cublasgemm-benchmark
code for benchmarking GPU performance based on cublasSgemm and cublasHgemm
☆35 • May 20, 2022 • Updated 4 years ago
PAA-NCIC / PPoPP2017_artifact
Third party assembler and GEMM library for NVIDIA Kepler GPU
☆86 • Oct 8, 2019 • Updated 6 years ago
gcoe-dresden / cuda-gpu-tlb
TLB Benchmarks
☆35 • Sep 11, 2017 • Updated 8 years ago
shen203 / GPU_Microbenchmark
☆25 • Jun 24, 2022 • Updated 4 years ago
gevtushenko / cuda_benchmark
A library to benchmark CUDA code, similar to google benchmark.
☆30 • Apr 18, 2021 • Updated 5 years ago
daadaada / gas
☆49 • Dec 11, 2020 • Updated 5 years ago
sjfeng1999 / gpu-arch-microbenchmark
Dissecting NVIDIA GPU Architecture
☆126 • Jul 11, 2022 • Updated 4 years ago
NervanaSystems / maxas
Assembler for NVIDIA Maxwell architecture
☆1,074 • Jan 3, 2023 • Updated 3 years ago
hyqneuron / asfermi
assembler for NVIDIA FERMI. Imported from Google Code
☆77 • Mar 22, 2015 • Updated 11 years ago
canonizer / halloc
A fast and highly scalable GPU dynamic memory allocator
☆111 • Mar 11, 2015 • Updated 11 years ago
apc-llc / nvcc-llvm-ir
Enabling on-the-fly manipulations with LLVM IR code of CUDA sources
☆124 • Apr 18, 2025 • Updated last year
krrishnarraj / clpeak
A synthetic micro-benchmark that measures peak compute, bandwidth, and matrix throughput of GPUs and CPUs
☆506 • Jul 21, 2026 • Updated last week
sunlex0717 / DissectingTensorCores
☆115 • Apr 19, 2024 • Updated 2 years ago
srvm / cupti_profiler
CUPTI GPU Profiler
☆39 • Feb 26, 2019 • Updated 7 years ago
ParCoreLab / CPU-Free-model
Source code for the CPU-Free model - a fully autonomous execution model for multi-GPU applications that completely excludes the involveme…
☆21 • Apr 25, 2024 • Updated 2 years ago
vortexgpgpu / NVPTX-SPIRV-Translator
The translator that supports translating NVPTX to SPIR-V. This translator is modified from LLVM-SPIR-V Translator.
☆45 • Oct 25, 2021 • Updated 4 years ago
xnd-project / cuda-benchmarks
Collection of CUDA benchmarks, with a focus on unified vs. explicit memory management.
☆21 • Oct 15, 2019 • Updated 6 years ago
RRZE-HPC / gpu-benches
collection of benchmarks to measure basic GPU capabilities
☆531 • Oct 24, 2025 • Updated 9 months ago
monotone-RK / FACE
FACE: Fast and Customizable Sorting Accelerator
☆11 • Sep 6, 2016 • Updated 9 years ago
yalue / cuda_scheduling_examiner_mirror
A tool for examining GPU scheduling behavior.
☆96 • Aug 17, 2024 • Updated last year
aoli-al / HFuse
Horizontal Fusion
☆24 • Jan 7, 2022 • Updated 4 years ago
UofT-EcoSystem / Tempo
Memory footprint reduction for transformer models
☆11 • Jan 24, 2023 • Updated 3 years ago
pyxis-roc / ptxparser
A parser for PTX 6.5
☆13 • Jun 19, 2023 • Updated 3 years ago
daadaada / turingas
Assembler for NVIDIA Volta and Turing GPUs
☆247 • Jan 13, 2022 • Updated 4 years ago
passlab / CUDAMicroBench
☆53 • Jun 24, 2025 • Updated last year
vetter / shoc
The SHOC Benchmark Suite
☆262 • Oct 6, 2025 • Updated 9 months ago
karthikeyann / cuda-calculator
HTML/JS port of CUDA Occupancy Calculator
☆17 • Nov 23, 2021 • Updated 4 years ago
NVIDIA / Fuser
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
☆395 • May 31, 2026 • Updated last month
ariasanovsky / ptx-parser
☆11 • Jun 9, 2023 • Updated 3 years ago
NVIDIA / cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
☆1,841 • Oct 9, 2023 • Updated 2 years ago
XiuYuLi / deepcore_source_code
Subpart source code of deepcore v0.7
☆27 • Jun 28, 2020 • Updated 6 years ago
andersy005 / tvm-in-action
TVM stack: exploring the incredible explosion of deep-learning frameworks and how to bring them together
☆65 • May 22, 2018 • Updated 8 years ago
clyfish / gcn-scrypt
Scrypt OpenCL kernel written in AMD GCN ISA assembly language
☆20 • Oct 9, 2014 • Updated 11 years ago
NVIDIA / nvbench
CUDA Kernel Benchmarking Library
☆913 • Updated this week
openai / openai-gemm
Open single and half precision gemm implementations
☆396 • Apr 2, 2023 • Updated 3 years ago
TiledTensor / TiledKernel
TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.
☆19 • May 12, 2024 • Updated 2 years ago
Nek5000 / parRSB
parallel graph partitioning using recursive spectral bisection (RSB)
☆23 • Jun 10, 2025 • Updated last year