Lin-Mao/DrGPUM

A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.

☆36

Alternatives and similar repositories for DrGPUM

Users that are interested in DrGPUM are comparing it to the libraries listed below.

FindHao / drgpu
A Top-Down Profiler for GPU Applications
☆23 · Feb 29, 2024 · Updated 2 years ago
AccelProf / AccelProf
A modular program analysis tool framework for accelerators (NVIDIA, AMD, and DL workloads).
☆24 · Jul 5, 2026 · Updated 2 weeks ago
MoZeWei / moTuner
☆10 · May 12, 2022 · Updated 4 years ago
Jokeren / GPA
GPU Performance Advisor
☆66 · Jul 25, 2022 · Updated 3 years ago
GVProf / GVProf
GVProf: A Value Profiler for GPU-based Clusters
☆54 · Mar 24, 2024 · Updated 2 years ago
Jokeren / Awesome-GPU
Awesome resources for GPUs
☆635 · Mar 10, 2026 · Updated 4 months ago
Ryu1845 / hyena-jax
Implementation of Hyena Hierarchy in JAX
☆10 · Apr 30, 2023 · Updated 3 years ago
ProjectPhysX / PTXprofiler
A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
☆59 · Mar 20, 2025 · Updated last year
daniel-geon-park / triton_bwd
Automatic differentiation for Triton Kernels
☆29 · Aug 12, 2025 · Updated 11 months ago
getianao / ngAP
ngAP's artifact for ASPLOS'24
☆25 · Jul 29, 2025 · Updated 11 months ago
concept-inversion / C-SAW
A Framework for Graph Sampling and Random Walk on GPUs.
☆38 · Feb 3, 2025 · Updated last year
bacaldwell / scalable-monitoring
Scripts for monitoring InfiniBand and storage devices
☆11 · Sep 4, 2015 · Updated 10 years ago
ROCm / rocm_bandwidth_test
Bandwidth test for ROCm
☆86 · Updated this week
Xuhpclab / jxperf
☆11 · Jan 4, 2022 · Updated 4 years ago
escalab / SIMD2
☆31 · Jun 15, 2022 · Updated 4 years ago
accel-sim / gpu-app-collection
A repository where GPU applications are aggregated using a common build flow that supports multiple CUDA versions.
☆93 · Apr 14, 2026 · Updated 3 months ago
Mantevo / HPCCG
High Performance Computing Conjugate Gradients: The original Mantevo miniapp
☆20 · Jan 29, 2024 · Updated 2 years ago
NVIDIA / compute-sanitizer-samples
Samples demonstrating how to use the Compute Sanitizer Tools and Public API
☆99 · Nov 6, 2023 · Updated 2 years ago
xianweiz / Python.paper.figures
Generate publication-quality figures using python
☆23 · Jun 5, 2016 · Updated 10 years ago
Qiong-WU / ARRR_code
☆29 · Oct 22, 2020 · Updated 5 years ago
flashinfer-ai / debug-print
Debug print operator for cudagraph debugging
☆18 · Aug 2, 2024 · Updated last year
Lysel / MNQ_LSTM
Code for the paper: "T-shape data and probabilistic remaining useful life prediction for Li-ion batteries using multiple non-crossing qua…
☆10 · Aug 4, 2023 · Updated 2 years ago
llamajun / qwen.metal
Local inference for the Llama and Tongyi Qianwen (Qwen) large language models, implemented with Apple Metal
☆10 · Apr 26, 2024 · Updated 2 years ago
escalab / RTSpMSpM
☆25 · Apr 13, 2025 · Updated last year
pgera / efg
GPU based Compressed Graph Traversal
☆16 · Jan 9, 2026 · Updated 6 months ago
malfet / llm_experiments
☆13 · Jul 12, 2026 · Updated last week
Linestro / GRACE
Artifact of ASPLOS'23 paper entitled: GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference
☆19 · Mar 5, 2023 · Updated 3 years ago
chenllliang / ParetoMNMT
Source code for paper "On the Pareto Front of Multilingual Neural Machine Translation" @ NeurIPS 2023
☆17 · Sep 27, 2023 · Updated 2 years ago
jesonxiang / cpp_extension_pybind11
A demo project demonstrating the performance improvement by cpp extension, which wrapped with pybind11.
☆10 · Nov 16, 2021 · Updated 4 years ago
image-rs / weezl
LZW en- and decoding that goes weeeee!
☆34 · May 17, 2026 · Updated 2 months ago
desert0616 / gpma_demo
Source code for the paper: Accelerating Dynamic Graph Analytics on GPUs
☆30 · Jun 19, 2023 · Updated 3 years ago
IronySuzumiya / NiuDianNao
A simple cycle-accurate DaDianNao simulator
☆13 · Mar 27, 2019 · Updated 7 years ago
ROCm / roc-stdpar
☆20 · Jan 17, 2024 · Updated 2 years ago
IntelligentSoftwareSystems / GaloisGPU
LonestarGPU: Irregular algorithms parallelized for GPUs
☆38 · Nov 11, 2019 · Updated 6 years ago
helchr / perfMemPlus
☆15 · Sep 28, 2020 · Updated 5 years ago
regehr / pldi22-llvm-tutorial
outline and links for PLDI 2022 tutorial
☆17 · Jun 13, 2022 · Updated 4 years ago
ROCm / HCC-Example-Application
HCC Sample Applications
☆13 · Jan 3, 2017 · Updated 9 years ago
Xuhpclab / DrCCTProf
DrCCTProf is a fine-grained call path profiling framework for binaries running on ARM and X86 architectures.
☆123 · Oct 26, 2023 · Updated 2 years ago
salesforce / simplification
☆23 · Jun 25, 2026 · Updated 3 weeks ago