A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
☆57Mar 20, 2025Updated last year
Alternatives and similar repositories for PTXprofiler
Users that are interested in PTXprofiler are comparing it to the libraries listed below.
- A Top-Down Profiler for GPU Applications☆22Feb 29, 2024Updated 2 years ago
- ☆10May 12, 2022Updated 4 years ago
- A parser for PTX 6.5☆13Jun 19, 2023Updated 2 years ago
- outline and links for PLDI 2022 tutorial☆17Jun 13, 2022Updated 3 years ago
- Rebuild YatSenOS On RISC-V 64.☆23Jan 6, 2022Updated 4 years ago
- study of cutlass☆22Nov 10, 2024Updated last year
- OpenCL is the most powerful programming language ever created. Yet the OpenCL C++ bindings are cumbersome and the code overhead prevents …☆474May 20, 2026Updated 3 weeks ago
- PTX-EMU is a simple emulator for CUDA programs.☆39Apr 25, 2025Updated last year
- CUDA grammar for tree-sitter☆36Nov 23, 2025Updated 6 months ago
- A Throughput-Optimized Pipeline Parallel Inference System for Large Language Models☆50Dec 24, 2025Updated 5 months ago
- simple port of hpl-2.0 to use NVIDIA GPU acceleration with CUBLAS☆29May 13, 2013Updated 13 years ago
- Speeding Up Your Python Codes 1000x☆12Apr 2, 2025Updated last year
- ☆18Apr 8, 2022Updated 4 years ago
- A repository where GPU applications are aggregated using a common build flow that supports multiple CUDA versions.☆93Apr 14, 2026Updated last month
- The translator that supports translating NVPTX to SPIR-V. This translator is modified from LLVM-SPIR-V Translator.☆45Oct 25, 2021Updated 4 years ago
- Using C++ magic to capture CUDA kernels and tune them with Kernel Tuner☆21Sep 12, 2025Updated 8 months ago
- Unit benchmarks of CUDA event APIs.☆17Apr 23, 2024Updated 2 years ago
- A native GPU bytecode compiler for constructive solid geometry☆25May 29, 2019Updated 7 years ago
- A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code☆16Mar 19, 2023Updated 3 years ago
- A GPU FP32 computation method with Tensor Cores.☆27Dec 8, 2025Updated 6 months ago
- Rodinia benchmark☆23Jul 5, 2024Updated last year
- Set sail on anime computer graphics!☆15Mar 1, 2023Updated 3 years ago
- ☆23Dec 18, 2025Updated 5 months ago
- Generate Linux Perf event tables for Apple Silicon☆17Dec 16, 2025Updated 5 months ago
- ☆24Jun 12, 2023Updated 2 years ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.☆32Apr 2, 2025Updated last year
- ☆40Dec 14, 2025Updated 5 months ago
- ☆42Apr 3, 2022Updated 4 years ago
- Generate publication-quality figures using python☆23Jun 5, 2016Updated 10 years ago
- ☆55Nov 21, 2019Updated 6 years ago
- An unofficial cuda assembler, for all generations of SASS, hopefully :)☆85Mar 20, 2023Updated 3 years ago
- High Performance Computing Conjugate Gradients: The original Mantevo miniapp☆19Jan 29, 2024Updated 2 years ago
- a simple API to use CUPTI☆10Aug 19, 2025Updated 9 months ago
- A pure-Python implementation of the Nvidia CuTe layout algebra intended to be approachable and easy to learn.☆182May 15, 2026Updated 3 weeks ago
- Yet another toy CPU.☆92Dec 10, 2023Updated 2 years ago
- eBPF tool to collect BOLT profile☆14Apr 9, 2026Updated 2 months ago
- Horizontal Fusion☆24Jan 7, 2022Updated 4 years ago
- Personal Notes for Learning HPC & Parallel Computation [NO LONGER ADDING NEW CONTENT]☆78Jul 29, 2022Updated 3 years ago
- A tool for cross-checking Verilog compilers☆15Apr 16, 2025Updated last year