mohamed / roofline
A simple script to plot the Roofline model for given HW platforms and applications
☆9 Updated 7 months ago
Alternatives and similar repositories for roofline:
Users who are interested in roofline are comparing it to the repositories listed below.
- ☆24 Updated 5 years ago
- ☆17 Updated 2 years ago
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA ☆32 Updated 4 years ago
- CUDAAdvisor: a GPU profiling tool ☆48 Updated 6 years ago
- GPU Performance Advisor ☆64 Updated 2 years ago
- ☆43 Updated 4 years ago
- C++/MPI proxies for distributed training of deep neural networks. ☆13 Updated 2 years ago
- ngAP's artifact for ASPLOS'24 ☆21 Updated 2 months ago
- GVProf: A Value Profiler for GPU-based Clusters ☆49 Updated last year
- Thinking is hard - automate it ☆19 Updated 2 years ago
- Evaluating different memory managers for dynamic GPU memory ☆25 Updated 4 years ago
- CUPTI GPU Profiler ☆37 Updated 6 years ago
- Modified version of PyTorch able to work with changes to GPGPU-Sim ☆51 Updated 2 years ago
- A GPU FP32 computation method with Tensor Cores. ☆20 Updated 2 years ago
- ☆12 Updated 4 years ago
- ☆33 Updated 2 years ago
- Emulating DMA Engines on GPUs for Performance and Portability ☆39 Updated 9 years ago
- Benchmark for matrix multiplications between dense and block sparse (BSR) matrix in TVM, blocksparse (Gray et al.) and cuSparse. ☆24 Updated 4 years ago
- DietCode Code Release ☆62 Updated 2 years ago
- Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS ☆21 Updated last month
- Study of Ampere's Sparse Matmul ☆17 Updated 4 years ago
- ☆13 Updated 3 years ago
- Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite ☆64 Updated 6 years ago
- Artifacts of EVT ASPLOS'24 ☆23 Updated last year
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs ☆28 Updated 6 months ago
- MAFIA: Multiple Application Framework for GPU architectures ☆25 Updated 3 years ago
- Source code of the simulator used in the Mosaic paper from MICRO 2017: "Mosaic: A GPU Memory Manager with Application-Transparent Support… ☆44 Updated 6 years ago
- ☆69 Updated 4 years ago
- A novel spatial accelerator for horizontal diffusion weather stencil computation, as described in ICS 2023 paper by Singh et al. (https:/… ☆18 Updated last year
- A Benchmark Suite for Heterogeneous System Computation ☆53 Updated last month