zingaburga / alderlake_avx512
Info on enabling AVX-512 on Alder Lake
☆42 Updated 2 years ago
Alternatives and similar repositories for alderlake_avx512:
Users who are interested in alderlake_avx512 are comparing it to the libraries listed below.
- ☆29 Updated 9 months ago
- InstLatX64_Demo ☆42 Updated last month
- Test if AVX vector loads and stores are atomic ☆29 Updated 4 years ago
- A Metal implementation similar to the official Metal C++ API ☆48 Updated last year
- Derived from Nemes' gpuperftests ☆30 Updated 8 months ago
- Microbenchmarking experiments on Zen 2 machines ☆16 Updated 2 years ago
- ZP7: Zach's Peppy Parallel-Prefix-Popcountin' PEXT/PDEP Polyfill ☆51 Updated 7 months ago
- A software library of lossless data compression methods tuned and optimized for AMD “Zen”-based CPUs ☆27 Updated last week
- Fork of LLVM with support for downgrading bitcode. ☆19 Updated 3 months ago
- Support for ternary logic in SSE, XOP, AVX2 and x86 programs ☆31 Updated 2 months ago
- OpenCL/SPIR-V implementation of HIP ☆104 Updated 2 years ago
- Utilities for accessing AMD's Machine-Readable GPU ISA Specifications. ☆30 Updated 3 weeks ago
- A Benchmark Toolkit for Assembly Instructions Using the LLVM JIT ☆16 Updated 4 years ago
- IMPORTANT NOTICE: This implementation is long outdated. The new libwfv will be released soon. Whole-Function Vectorization is an algorith… ☆22 Updated 12 years ago
- The new home for CnC Tests and Framework Libraries ☆57 Updated 3 months ago
- A High-Throughput Parallel Lossless Compressor for Scientific Data ☆64 Updated 2 years ago
- A runtime SPIR-V assembler ☆43 Updated 2 years ago
- ☆56 Updated 6 months ago
- Performance Counter Measurements at the cycle granularity ☆18 Updated 3 years ago
- Performance Counter Reader ☆124 Updated 3 months ago
- Intriman is a documentation generator that retargets the Intel Intrinsics Guide to other documentation formats ☆28 Updated 2 years ago
- Table of ARM SoC and their features ☆49 Updated last week
- AVX-512 documentation beyond what Intel provides ☆47 Updated last year
- C implementation of the L-Mul f32/f16 multiplications from paper: https://arxiv.org/html/2410.00907 ☆27 Updated 5 months ago
- ☆56 Updated last week
- GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs ☆14 Updated last year
- A collection of (public) notes on assorted topics ☆75 Updated 2 weeks ago
- ROB size testing utility ☆144 Updated 3 years ago
- LLVM AMDGPU Assembler Helper Tools ☆111 Updated 7 years ago
- A small library and kernel module for easy access to x86 performance monitor counters under Linux. ☆96 Updated 11 months ago