VictorRodriguez / AVX-SG
Advanced Vector Extensions (AVX) basic tutorial
☆37 Updated 4 years ago
Alternatives and similar repositories for AVX-SG
Users who are interested in AVX-SG are comparing it to the libraries listed below.
- Example code for Intel AVX / AVX2 intrinsics. ☆142 Updated 2 years ago
- Short examples illustrating AVX2 intrinsics for simple tasks. ☆98 Updated last year
- Chai ☆46 Updated 3 weeks ago
- Flexible GPGPU instrumentation ☆89 Updated 6 years ago
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels. ☆138 Updated 2 years ago
- ☆94 Updated 8 years ago
- The SHOC Benchmark Suite ☆259 Updated 2 months ago
- tools to create performance and roofline plots from measured data ☆60 Updated 11 years ago
- ☆48 Updated 5 years ago
- TLB Benchmarks ☆34 Updated 8 years ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆109 Updated 8 years ago
- Tests and benchmarks for cudnn (and in the future, other nvidia libraries) ☆53 Updated 5 years ago
- A low-overhead tool to periodically collect system-wide hardware performance counters on Intel64 systems. ☆32 Updated 3 years ago
- Artifact Evaluation Reproduction for "Software Prefetching for Indirect Memory Accesses", CGO 2017, using CK. ☆42 Updated 4 years ago
- HPC Challenge Benchmark ☆64 Updated 2 months ago
- 🎃 GPU load-balancing library for regular and irregular computations. ☆63 Updated 3 months ago
- A 128 bit unsigned integer class for CUDA ☆46 Updated 11 months ago
- Stencil Probe - a stencil microbenchmark ☆30 Updated 12 years ago
- Utilities to measure read access times of caches, memory, and hardware prefetches for simple and fused operations ☆85 Updated 2 years ago
- NUMA-aware multi-CPU multi-GPU data transfer benchmarks ☆26 Updated 2 years ago
- Multiple 1-stencil implementations using nvidia cuda. ☆13 Updated 8 years ago
- GPUfs - File system support for NVIDIA GPUs ☆98 Updated 7 years ago
- CUPTI GPU Profiler ☆40 Updated 6 years ago
- GPUDirect Async support for IB Verbs ☆133 Updated 3 years ago
- Scalable GPU Kernel Fission/Fusion Transformation for Memory-Bound Kernels ☆14 Updated 10 years ago
- Test the non-AVX, AVX2 and AVX-512 speeds across various active core counts ☆229 Updated last year
- An attempt at achieving the theoretical best memory bandwidth of my machine. ☆53 Updated 12 years ago
- ☆63 Updated last year
- Tools and extensions for CUDA profiling ☆64 Updated 5 years ago
- ☆288 Updated 2 months ago