twest820 / AVX-512Links
AVX-512 documentation beyond what Intel provides
☆65 · Updated 2 years ago
Alternatives and similar repositories for AVX-512Links
Users who are interested in AVX-512Links are comparing it to the repositories listed below
- CPU Ultimate Latency Test. ☆117 · Updated 4 months ago
- A description of Minotaur can be found in https://arxiv.org/abs/2306.00229. ☆124 · Updated 2 weeks ago
- InstLatX64_Demo ☆45 · Updated 3 months ago
- immintrin_dbg.h is an include file, a wrapper around immintrin.h. It implements most of AVX, AVX2, AVX-512 vector intrinsics to enable so… ☆59 · Updated 3 years ago
- A small library and kernel module for easy access to x86 performance monitor counters under Linux. ☆106 · Updated last year
- Open Source Architecture Code Analyzer ☆347 · Updated last week
- ROB size testing utility ☆158 · Updated 4 years ago
- uops.info Code Analyzer ☆322 · Updated 2 years ago
- Test the non-AVX, AVX2 and AVX-512 speeds across various active core counts ☆230 · Updated last year
- AOCL-LibM ☆126 · Updated this week
- Reworking of Agner Fog's performance test programs for Linux ☆116 · Updated 2 months ago
- Very low-overhead timer/counter interfaces for C on Intel 64 processors. ☆140 · Updated 2 months ago
- C++ template library for floating point operations ☆37 · Updated 3 weeks ago
- Create man pages from information used by Intel Intrinsics Guide and optionally uops.info ☆46 · Updated last year
- A terminal viewer for x86 instruction/intrinsic information using Python 3 + curses ☆128 · Updated 3 years ago
- Batched random number generation ☆18 · Updated 3 months ago
- A collection of (public) notes on assorted topics ☆79 · Updated 4 months ago
- The future home for CnC Tests and Framework Libraries ☆57 · Updated 6 months ago
- Intel® Instrumentation and Tracing Technology (ITT) and Just-In-Time (JIT) APIs ☆127 · Updated this week
- Wyrm is a GCC GIMPLE to LLVM IR transpiler ☆57 · Updated last year
- A fast implementation of log() and exp() ☆56 · Updated 3 years ago
- Clang supporting syntax plugins ☆21 · Updated 3 years ago
- Copy of instlatx64.atw.hu ☆232 · Updated 3 weeks ago
- Instruction latency & throughput profiler for AArch64 ☆42 · Updated 5 months ago
- Testing memory-level parallelism ☆82 · Updated last year
- CudaPAD is a PTX/SASS viewer for NVIDIA Cuda kernels and provides an on-the-fly view of the assembly. ☆126 · Updated 3 years ago
- RV: A Unified Region Vectorizer for LLVM ☆113 · Updated 7 months ago
- Fast CRC32 implementations ☆114 · Updated last month