A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
☆57 · Mar 20, 2025 · Updated last year
Alternatives and similar repositories for PTXprofiler
Users that are interested in PTXprofiler are comparing it to the libraries listed below.
- A Top-Down Profiler for GPU Applications ☆22 · Feb 29, 2024 · Updated 2 years ago
- ☆11 · Jun 9, 2023 · Updated 2 years ago
- A parser for PTX 6.5 ☆13 · Jun 19, 2023 · Updated 2 years ago
- Rebuild YatSenOS On RISC-V 64. ☆23 · Jan 6, 2022 · Updated 4 years ago
- study of cutlass ☆22 · Nov 10, 2024 · Updated last year
- ngAP's artifact for ASPLOS'24 ☆25 · Jul 29, 2025 · Updated 9 months ago
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications. ☆36 · Oct 13, 2024 · Updated last year
- A small OpenCL benchmark program to measure peak GPU/CPU performance. ☆296 · Updated this week
- PTX-EMU is a simple emulator for CUDA program. ☆38 · Apr 25, 2025 · Updated last year
- CUDA grammar for tree-sitter ☆35 · Nov 23, 2025 · Updated 5 months ago
- A Throughput-Optimized Pipeline Parallel Inference System for Large Language Models ☆50 · Dec 24, 2025 · Updated 4 months ago
- Speeding Up Your Python Codes 1000x ☆12 · Apr 2, 2025 · Updated last year
- ☆18 · Apr 8, 2022 · Updated 4 years ago
- The translator that supports translating NVPTX to SPIR-V. This translator is modified from LLVM-SPIR-V Translator. ☆45 · Oct 25, 2021 · Updated 4 years ago
- Using C++ magic to capture CUDA kernels and tune them with Kernel Tuner ☆21 · Sep 12, 2025 · Updated 8 months ago
- RISC-V-based many-core neuromorphic architecture ☆16 · Updated this week
- Unit benchmarks of CUDA event APIs. ☆17 · Apr 23, 2024 · Updated 2 years ago
- A native GPU bytecode compiler for constructive solid geometry ☆25 · May 29, 2019 · Updated 6 years ago
- A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code ☆16 · Mar 19, 2023 · Updated 3 years ago
- Cheap: customized heaps for improved application performance. ☆28 · Oct 11, 2022 · Updated 3 years ago
- A GPU FP32 computation method with Tensor Cores. ☆27 · Dec 8, 2025 · Updated 5 months ago
- Rodinia benchmark ☆23 · Jul 5, 2024 · Updated last year
- ☆23 · Dec 18, 2025 · Updated 5 months ago
- Generate Linux Perf event tables for Apple Silicon ☆17 · Dec 16, 2025 · Updated 5 months ago
- rv6 is a kernel & operating system written entirely in rust. ☆11 · Nov 7, 2019 · Updated 6 years ago
- A GPU benchmark suite for autotuners ☆19 · Feb 20, 2024 · Updated 2 years ago
- ☆24 · Jun 12, 2023 · Updated 2 years ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆32 · Apr 2, 2025 · Updated last year
- ☆41 · Apr 3, 2022 · Updated 4 years ago
- A distributed key value database based on LSM Tree storage ☆15 · Aug 24, 2022 · Updated 3 years ago
- Generate publication-quality figures using python ☆23 · Jun 5, 2016 · Updated 9 years ago
- ☆55 · Nov 21, 2019 · Updated 6 years ago
- High Performance Computing Conjugate Gradients: The original Mantevo miniapp ☆19 · Jan 29, 2024 · Updated 2 years ago
- a simple API to use CUPTI ☆10 · Aug 19, 2025 · Updated 9 months ago
- A pure-Python implementation of the Nvidia CuTe layout algebra intended to be approachable and easy to learn. ☆179 · May 7, 2026 · Updated 2 weeks ago
- Yet another toy CPU. ☆92 · Dec 10, 2023 · Updated 2 years ago
- Automatic differentiation for Triton Kernels ☆29 · Aug 12, 2025 · Updated 9 months ago
- eBPF tool to collect BOLT profile ☆14 · Apr 9, 2026 · Updated last month
- Horizontal Fusion ☆24 · Jan 7, 2022 · Updated 4 years ago