ProjectPhysX / PTXprofiler
A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
☆43 · Updated 10 months ago
Related projects
Alternatives and complementary repositories for PTXprofiler
- SYCL Conformance Tests ☆62 · Updated this week
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆99 · Updated this week
- Reusable software components for ROCm developers ☆79 · Updated this week
- rocWMMA ☆91 · Updated this week
- SYCL Benchmark Suite ☆56 · Updated 2 months ago
- An extension library of WMMA API (Tensor Core API) ☆84 · Updated 4 months ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 · Updated last year
- SYCL Reference Manual ☆26 · Updated 6 months ago
- Advanced Profiling and Analytics for AMD Hardware ☆135 · Updated this week
- 🎃 GPU load-balancing library for regular and irregular computations. ☆57 · Updated 5 months ago
- ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs ☆75 · Updated last week
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆99 · Updated 7 years ago
- AMD lab notes with code examples to demonstrate use of AMD GPUs ☆91 · Updated 4 months ago
- Compiler agnostic metaprogramming library providing concepts, type operations and tuples for C++ and cuda ☆80 · Updated this week
- A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators ☆67 · Updated 10 months ago
- ROCm SPARSE marshalling library ☆69 · Updated this week
- ☆54 · Updated 3 weeks ago
- ☆128 · Updated this week
- hipFFT is an FFT marshalling library. ☆54 · Updated this week
- Next generation SPARSE implementation for ROCm platform ☆116 · Updated this week
- Examples for using SYCL on CUDA ☆60 · Updated 2 weeks ago
- CUDAAdvisor: a GPU profiling tool ☆48 · Updated 6 years ago
- AMD’s C++ library for accelerating tensor primitives ☆35 · Updated this week
- A framework that supports executing unmodified CUDA source code on non-NVIDIA devices. ☆103 · Updated 3 months ago
- Development repository for the Open Earth Compiler ☆77 · Updated 3 years ago
- Next generation LAPACK implementation for ROCm platform ☆94 · Updated this week
- ☆50 · Updated 4 years ago
- GPU Performance Advisor ☆63 · Updated 2 years ago
- Next generation library for iterative sparse solvers for ROCm platform ☆76 · Updated this week
- CUDA Flux is a profiler for GPU applications which reports the basic block execution frequencies of compute kernels ☆31 · Updated 3 years ago