twest820 / AVX-512Links
AVX-512 documentation beyond what Intel provides
☆57Updated last year
Alternatives and similar repositories for AVX-512
Users that are interested in AVX-512 are comparing it to the libraries listed below
- InstLatX64_Demo☆44Updated last week
- CPU Ultimate Latency Test.☆112Updated last month
- ☆58Updated last week
- A description of Minotaur can be found in https://arxiv.org/abs/2306.00229.☆112Updated last month
- A fast implementation of log() and exp()☆53Updated 2 years ago
- uops.info Code Analyzer☆293Updated last year
- Instruction latency & throughput profiler for AArch64☆39Updated 2 months ago
- x86-64, ARM, and RVV intrinsics viewer☆66Updated last month
- Open Source Architecture Code Analyzer☆334Updated 3 weeks ago
- ROB size testing utility☆158Updated 3 years ago
- immintrin_dbg.h is an include file, a wrapper around immintrin.h. It implements most of AVX, AVX2, AVX-512 vector intrinsics to enable so…☆58Updated 2 years ago
- A small library and kernel module for easy access to x86 performance monitor counters under Linux.☆103Updated last year
- Intel® Instrumentation and Tracing Technology (ITT) and Just-In-Time (JIT) APIs☆121Updated 3 weeks ago
- Test the non-AVX, AVX2 and AVX-512 speeds across various active core counts☆227Updated last year
- ☆58Updated last month
- A tool for running small microbenchmarks on recent Intel and AMD x86 CPUs.☆484Updated 4 months ago
- Trying to figure various CPU things out☆87Updated last year
- Create man pages from information used by Intel Intrinsics Guide and optionally uops.info☆45Updated 10 months ago
- A terminal viewer for x86 instruction/intrinsic information using Python 3 + curses☆128Updated 2 years ago
- ☆48Updated 2 months ago
- RV: A Unified Region Vectorizer for LLVM☆112Updated 4 months ago
- ☆31Updated last week
- Utilities to measure read access times of caches, memory, and hardware prefetches for simple and fused operations☆85Updated 2 years ago
- A header only library implementing common mathematical functions using SIMD intrinsics☆113Updated last month
- Very low-overhead timer/counter interfaces for C on Intel 64 processors.☆136Updated last week
- Programmatically obtain information about the pages backing a given memory region☆79Updated 4 years ago
- The future home for CnC Tests and Framework Libraries☆58Updated 3 months ago
- Support for ternary logic in SSE, XOP, AVX2 and x86 programs☆31Updated 9 months ago
- CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels and provides an on-the-fly view of the assembly.☆124Updated 2 years ago
- Ocolos is the first online code layout optimization system for unmodified applications written in unmanaged languages.☆53Updated 4 months ago