lightsighter / Weft
A Sound and Complete Verification Tool for Warp-Specialized GPU Kernels
☆18 Updated 9 years ago
Alternatives and similar repositories for Weft:
Users who are interested in Weft compare it to the libraries listed below.
- LonestarGPU: Irregular algorithms parallelized for GPUs ☆33 Updated 5 years ago
- CUDAAdvisor: a GPU profiling tool ☆48 Updated 6 years ago
- Loop Kernel Analysis and Performance Modeling Toolkit ☆91 Updated 5 months ago
- A task benchmark ☆41 Updated 6 months ago
- tools to create performance and roofline plots from measured data ☆58 Updated 10 years ago
- A unified framework across multiple programming platforms ☆36 Updated 8 months ago
- The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear … ☆70 Updated this week
- ☆52 Updated 5 years ago
- Chai ☆42 Updated last year
- Compute applications. ☆24 Updated 5 years ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU ☆80 Updated 5 years ago
- Autonomic Performance Environment for eXascale (APEX) ☆43 Updated last week
- ☆51 Updated 5 years ago
- Library to plot integer sets and maps ☆49 Updated 8 years ago
- This tool serves as a test harness for different optimization techniques to improve stencil computations performance in shared and distri… ☆21 Updated 2 years ago
- A Benchmark Suite for Heterogeneous System Computation ☆53 Updated 3 months ago
- ☆40 Updated this week
- Home of ALP/GraphBLAS and ALP/Pregel, featuring shared- and distributed-memory auto-parallelisation of linear algebraic and vertex-centri… ☆25 Updated 2 weeks ago
- TLB Benchmarks ☆33 Updated 7 years ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆50 Updated this week
- Evaluating different memory managers for dynamic GPU memory ☆24 Updated 4 years ago
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs ☆28 Updated 5 months ago
- Codeplay project for contributions to the LLVM SYCL implementation ☆30 Updated 4 years ago
- Flexible GPGPU instrumentation ☆86 Updated 5 years ago
- A GPU algorithm for sparse matrix-matrix multiplication ☆67 Updated 4 years ago
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆46 Updated 10 years ago
- A framework that helps implementing swizzle GPU kernels ☆42 Updated 4 years ago
- GPU Optimization and Memory Abstraction Framework ☆32 Updated 5 years ago
- Enabling on-the-fly manipulations with LLVM IR code of CUDA sources ☆109 Updated 2 years ago
- A library to benchmark CUDA code, similar to google benchmark. ☆28 Updated 3 years ago