A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.
☆37May 30, 2026Updated last month
Alternatives and similar repositories for DrGPUM
Users that are interested in DrGPUM are comparing it to the libraries listed below.
- A Top-Down Profiler for GPU Applications☆23Feb 29, 2024Updated 2 years ago
- A modular program analysis tool framework for accelerators (NVIDIA, AMD, and DL workloads).☆24Apr 22, 2026Updated 2 months ago
- GPU Performance Advisor☆66Jul 25, 2022Updated 3 years ago
- ☆10May 12, 2022Updated 4 years ago
- GVProf: A Value Profiler for GPU-based Clusters☆54Mar 24, 2024Updated 2 years ago
- Awesome resources for GPUs☆629Mar 10, 2026Updated 3 months ago
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …☆32Dec 21, 2024Updated last year
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.☆58Mar 20, 2025Updated last year
- ngAP's artifact for ASPLOS'24☆25Jul 29, 2025Updated 11 months ago
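- AI large-model role-play chat system | Highly human-like voice replies | Supports one-on-one and multi-character group chats | Plays out Dream of the Red Chamber, Jin Ping Mei, and custom storylines | Romance, friendship, sibling bonds, father-son bonds, flirtation, complex love triangles | Immersive experience with 300+ voice timbres | Android & Web | Commercial-ready, controllable source code | Enhances character interaction…☆63Apr 12, 2026Updated 2 months ago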
- Scripts for monitoring InfiniBand and storage devices☆11Sep 4, 2015Updated 10 years ago
- ☆11Jan 4, 2022Updated 4 years ago
- Bandwidth test for ROCm☆86Jun 16, 2026Updated 2 weeks ago
- A Framework for Graph Sampling and Random Walk on GPUs.☆38Feb 3, 2025Updated last year
- ☆31Jun 15, 2022Updated 4 years ago
- A repository where GPU applications are aggregated using a common build flow that supports multiple CUDA versions.☆92Apr 14, 2026Updated 2 months ago
- Debug print operator for cudagraph debugging☆18Aug 2, 2024Updated last year
- Pluggable role definitions for AI coding agents — one command turns Claude Code / Cursor / OpenCode / Codex into a specialized profession…☆45Mar 28, 2026Updated 3 months ago
- Generate publication-quality figures using python☆23Jun 5, 2016Updated 10 years ago
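- TideDesk is an automation workbench for content operations and knowledge archiving. It supports linking multiple X accounts; fetching recommended, trending, and search content; deduplicated archiving; category and tag organization; automatic AI analysis; weekly and monthly report generation; and one-click distribution to platforms such as WeChat Official Accounts, Zhihu, and CSDN, helping individuals or teams turn information streams into manageable, reusable, publishable content…☆51Mar 24, 2026Updated 3 months ago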
- Local inference for the Llama and Tongyi Qianwen (Qwen) large models, implemented with Apple Metal☆10Apr 26, 2024Updated 2 years ago
- GPU based Compressed Graph Traversal☆16Jan 9, 2026Updated 5 months ago
- ☆13May 11, 2026Updated last month
- A demo project showing the performance improvement from a C++ extension wrapped with pybind11.☆10Nov 16, 2021Updated 4 years ago
- Source code for paper "On the Pareto Front of Multilingual Neural Machine Translation" @ NeurIPS 2023☆17Sep 27, 2023Updated 2 years ago
- Audit npm, Yarn, and pnpm lockFiles as both an MCP server and a CLI tool.☆54May 22, 2026Updated last month
- ☆29Oct 22, 2020Updated 5 years ago
- LZW en- and decoding that goes weeeee!☆34May 17, 2026Updated last month
- A task benchmark☆46Apr 17, 2026Updated 2 months ago
- A simple cycle-accurate DaDianNao simulator☆13Mar 27, 2019Updated 7 years ago
- ☆20Jan 17, 2024Updated 2 years ago
- Code for the paper: "T-shape data and probabilistic remaining useful life prediction for Li-ion batteries using multiple non-crossing qua…☆10Aug 4, 2023Updated 2 years ago
- DrCCTProf is a fine-grained call path profiling framework for binaries running on ARM and X86 architectures.☆123Oct 26, 2023Updated 2 years ago
- HCC Sample Applications☆13Jan 3, 2017Updated 9 years ago
- LonestarGPU: Irregular algorithms parallelized for GPUs☆38Nov 11, 2019Updated 6 years ago
- CPU and GPU tutorial examples☆13Apr 4, 2025Updated last year
- [FAST'25] ShiftLock: Mitigate One-sided RDMA Lock Contention via Handover.☆20Feb 11, 2025Updated last year
- ☆17Dec 9, 2022Updated 3 years ago
- ☆17Sep 27, 2022Updated 3 years ago