jeewhanchoi / a-roofline-model-of-energy-ubenchmarks
Automatically exported from code.google.com/p/a-roofline-model-of-energy-ubenchmarks
☆11, updated 4 years ago
Alternatives and similar repositories for a-roofline-model-of-energy-ubenchmarks:
Users who are interested in a-roofline-model-of-energy-ubenchmarks are comparing it to the libraries listed below.
- Chai ☆43, updated last year
- Flexible GPGPU instrumentation ☆86, updated 5 years ago
- Instantiate the Cache Aware Roofline Model on single socket and multisocket systems. ☆27, updated 6 years ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU ☆81, updated 5 years ago
- ☆59, updated 5 months ago
- ☆43, updated 4 years ago
- A Benchmark Suite for Heterogeneous System Computation ☆53, updated last month
- TLB Benchmarks ☆33, updated 7 years ago
- CSR-based SpMV on Heterogeneous Processors (Intel Broadwell, AMD Kaveri and nVidia Tegra K1) ☆27, updated 9 years ago
- The SparseX sparse kernel optimization library ☆40, updated 6 years ago
- A BarrierPoint implementation: Automatically select representative regions of parallel applications ☆14, updated 8 years ago
- The SHOC Benchmark Suite ☆251, updated 3 years ago
- A low-overhead tool to periodically collect system-wide hardware performance counters on Intel64 systems. ☆33, updated 2 years ago
- Parallelized and vectorized SpMV on Intel Xeon Phi (Knights Landing, AVX512, KNL) ☆25, updated last year
- Evaluating different memory managers for dynamic GPU memory ☆25, updated 4 years ago
- Tools to create performance and roofline plots from measured data ☆58, updated 10 years ago
- Benchmark for measuring the performance of sparse and irregular memory access. ☆76, updated last month
- CUDA Flux is a profiler for GPU applications which reports the basic block execution frequencies of compute kernels ☆32, updated 4 years ago
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs ☆28, updated 6 months ago
- CUDAAdvisor: a GPU profiling tool ☆48, updated 6 years ago
- Loop Kernel Analysis and Performance Modeling Toolkit ☆92, updated last week
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆105, updated 7 years ago
- A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves (SpTRSV) ☆21, updated 5 years ago
- A domain-specific language and compiler for image processing ☆76, updated 4 years ago
- Compute applications. ☆24, updated 5 years ago
- ☆24, updated 5 years ago
- LonestarGPU: Irregular algorithms parallelized for GPUs ☆34, updated 5 years ago
- Parallel Tensor Infrastructure (ParTI!) ☆28, updated 4 years ago
- ☆53, updated 5 years ago
- Artifact Evaluation Reproduction for "Software Prefetching for Indirect Memory Accesses", CGO 2017, using CK. ☆38, updated 3 years ago