VictorRodriguez / AVX-SG
Advanced Vector Extensions (AVX) basic tutorial
☆37 Updated 4 years ago
Alternatives and similar repositories for AVX-SG
Users who are interested in AVX-SG are comparing it to the repositories listed below.
- TLB Benchmarks ☆34 Updated 7 years ago
- Example code for Intel AVX / AVX2 intrinsics. ☆138 Updated last year
- Short examples illustrating AVX2 intrinsics for simple tasks. ☆95 Updated last year
- NUMA-aware multi-CPU multi-GPU data transfer benchmarks ☆23 Updated last year
- Unit benchmarks of CUDA event APIs. ☆17 Updated last year
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆105 Updated 7 years ago
- Chai ☆44 Updated last year
- ❤️ CUDA/C++ GPU graph analytics simplified. ☆31 Updated 2 years ago
- ☆91 Updated 8 years ago
- Sparse matrix computation library for GPU ☆56 Updated 4 years ago
- Artifact Evaluation Reproduction for "Software Prefetching for Indirect Memory Accesses", CGO 2017, using CK. ☆38 Updated 3 years ago
- ☆44 Updated 4 years ago
- The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github… ☆32 Updated last month
- Multiple 1-stencil implementations using NVIDIA CUDA. ☆13 Updated 7 years ago
- 🎃 GPU load-balancing library for regular and irregular computations. ☆62 Updated last year
- Thinking is hard - automate it ☆19 Updated 2 years ago
- Haystack is an analytical cache model that, given a program, computes the number of cache misses. ☆46 Updated 5 years ago
- Stencil Probe - a stencil microbenchmark ☆30 Updated 12 years ago
- Parallelized and vectorized SpMV on Intel Xeon Phi (Knights Landing, AVX512, KNL) ☆24 Updated last year
- Scalable GPU Kernel Fission/Fusion Transformation for Memory-Bound Kernels ☆13 Updated 9 years ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU ☆81 Updated 5 years ago
- CUPTI GPU Profiler ☆38 Updated 6 years ago
- HCC Sample Applications ☆13 Updated 8 years ago
- A low-overhead tool to periodically collect system-wide hardware performance counters on Intel64 systems. ☆32 Updated 2 years ago
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs ☆30 Updated 9 months ago
- ☆17 Updated 3 years ago
- cuASR: CUDA Algebra for Semirings ☆36 Updated 2 years ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆33 Updated 2 months ago
- gossip: Efficient Communication Primitives for Multi-GPU Systems ☆59 Updated 2 years ago
- Fast AVX512 (AVX-512) quicksort + bitonic sort. ☆28 Updated 3 years ago