kshitijl / avx2-examples
Short examples illustrating AVX2 intrinsics for simple tasks.
☆87Updated 11 months ago
Alternatives and similar repositories for avx2-examples:
Users who are interested in avx2-examples are comparing it to the libraries listed below.
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.☆50Updated last year
- A 128-bit unsigned integer class for CUDA☆43Updated last month
- The Berkeley Container Library☆122Updated last year
- CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels that provides an on-the-fly view of the assembly.☆113Updated 2 years ago
- GPU-accelerated, error-bounded lossy compression for scientific data.☆72Updated this week
- Little OpenMP Library☆157Updated 2 years ago
- ☆130Updated last week
- GPU-Accelerated Lossless Data Compressors Survey☆113Updated 4 years ago
- Massively Parallel Huffman Decoding on GPUs☆47Updated 6 years ago
- TLB Benchmarks☆33Updated 7 years ago
- Example code for Intel AVX / AVX2 intrinsics.☆134Updated last year
- MatMul performance benchmarks for a single CPU core, comparing hand-engineered and codegen kernels.☆127Updated last year
- An implementation of HIP that works on CPUs, across OSes.☆115Updated 11 months ago
- UME::SIMD: a library for explicit SIMD vectorization.☆91Updated 7 years ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth☆104Updated 7 years ago
- Provides a set of benchmarks that can be used to measure the memory bandwidth performance of CPUs☆84Updated 10 months ago
- Header-only C++ library for low precision floating point type emulation.☆168Updated 5 years ago
- The Splash-3 benchmark suite☆42Updated last year
- ROCm Thrust - run Thrust dependent software on AMD GPUs☆105Updated this week
- ☆52Updated 5 years ago
- Benchmark for measuring the performance of sparse and irregular memory access.☆76Updated last week
- ☆68Updated 4 years ago
- RV: A Unified Region Vectorizer for LLVM☆107Updated 3 weeks ago
- ☆56Updated this week
- Third party assembler and GEMM library for NVIDIA Kepler GPU☆80Updated 5 years ago
- Tools to create performance and roofline plots from measured data☆58Updated 10 years ago
- CUDA kernel author's tools☆110Updated 2 years ago
- CUDAAdvisor: a GPU profiling tool☆48Updated 6 years ago
- A library to benchmark CUDA code, similar to google benchmark.☆28Updated 3 years ago
- Source code for the CPU-Free model - a fully autonomous execution model for multi-GPU applications that completely excludes the involveme…☆17Updated 9 months ago