twest820 / AVX-512Links
AVX-512 documentation beyond what Intel provides
☆49 Updated last year
Alternatives and similar repositories for AVX-512Links
Users who are interested in AVX-512Links are comparing it to the repositories listed below
- InstLatX64_Demo ☆43 Updated last week
- ROB size testing utility ☆151 Updated 3 years ago
- A small library and kernel module for easy access to x86 performance monitor counters under Linux. ☆97 Updated last year
- ☆57 Updated this week
- A minimal (really) out-of-tree MLIR example ☆44 Updated 2 weeks ago
- The new home for CnC Tests and Framework Libraries ☆57 Updated 5 months ago
- uops.info Code Analyzer ☆270 Updated last year
- A description of Minotaur can be found in https://arxiv.org/abs/2306.00229. ☆105 Updated 9 months ago
- Instruction latency & throughput profiler for AArch64 ☆34 Updated last year
- ☆56 Updated 8 months ago
- Trying to figure various CPU things out ☆78 Updated last year
- Utilities to measure read access times of caches, memory, and hardware prefetches for simple and fused operations ☆83 Updated last year
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆52 Updated 2 months ago
- A collection of performance analysis tools, recipes, handy scripts, microbenchmarks & more ☆138 Updated 2 months ago
- A fast implementation of log() and exp() ☆53 Updated 2 years ago
- ☆29 Updated 2 years ago
- Test the non-AVX, AVX2 and AVX-512 speeds across various active core counts ☆214 Updated 7 months ago
- CudaPAD is a PTX/SASS viewer for NVIDIA Cuda kernels and provides an on-the-fly view of the assembly. ☆119 Updated 2 years ago
- Create man pages from information used by Intel Intrinsics Guide and optionally uops.info ☆45 Updated 5 months ago
- Test if AVX vector loads and stores are atomic ☆30 Updated 4 years ago
- Open Source Architecture Code Analyzer ☆322 Updated 3 weeks ago
- Reviving the old comp-arch.net wiki? ☆18 Updated last year
- A selection of ANSI C benchmarks and programs useful as benchmarks ☆85 Updated 9 months ago
- CPU Ultimate Latency Test. ☆110 Updated last week
- This is a mirror of the official libpfm4 git repository, https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/ with some local branc… ☆63 Updated 7 months ago
- Collection of synchronization micro-benchmarks and traces from infrastructure applications ☆41 Updated 3 weeks ago
- Record "perf" performance metrics for individual functions/regions of an ELF binary. ☆80 Updated last year
- RV: A Unified Region Vectorizer for LLVM ☆108 Updated last week
- ☆30 Updated last year
- Tools and Reference Code for Intel Optimizations (eg Large Pages) ☆143 Updated 8 months ago