berenger-eu / avx-512-sort
Fast AVX512 (AVX-512) quicksort + bitonic sort.
☆25 Updated 2 years ago
Related projects:
- InstLatX64_Demo ☆41 Updated last month
- AVX512F and AVX2 versions of quick sort ☆102 Updated 6 years ago
- User-space Page Management ☆102 Updated last month
- A small library and kernel module for easy access to x86 performance monitor counters under Linux. ☆91 Updated 4 months ago
- immintrin_dbg.h is an include file, a wrapper around immintrin.h. It implements most of AVX, AVX2, AVX-512 vector intrinsics to enable so… ☆57 Updated last year
- Intel® Instrumentation and Tracing Technology (ITT) and Just-In-Time (JIT) API ☆85 Updated last week
- ☆32 Updated 2 months ago
- This is a mirror of the official libpfm4 git repository, https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/ with some local branc… ☆54 Updated last month
- Very low-overhead timer/counter interfaces for C on Intel 64 processors. ☆116 Updated 4 years ago
- CERE: Codelet Extractor and REplayer ☆40 Updated 11 months ago
- Benchmarks for auto-vectorization and revectorization, including both hand-vectorized and scalar code ☆24 Updated 5 years ago
- Testing memory-level parallelism ☆64 Updated 6 months ago
- ROB size testing utility ☆128 Updated 2 years ago
- Generic Automatic Parallel Profiler ☆28 Updated 3 years ago
- A Benchmark Toolkit for Assembly Instructions Using the LLVM JIT ☆16 Updated 3 years ago
- Programmatically obtain information about the pages backing a given memory region ☆71 Updated 2 years ago
- A collection of performance analysis tools, recipes, handy scripts, microbenchmarks & more ☆107 Updated this week
- Microbenchmarks for Aarch64 (Cortex A53) ☆12 Updated last year
- Record "perf" performance metrics for individual functions/regions of an ELF binary. ☆69 Updated 8 months ago
- Library with JIT (Just-in-time) compilation support to optimize performance of small and medium matrix multiplication ☆12 Updated 3 years ago
- Sample program for article "SIMD-ized searching in unique constant dictionary" (http://0x80.pl/articles/simd-search.html) ☆50 Updated 7 years ago
- Ocolos is the first online code layout optimization system for unmodified applications written in unmanaged languages. ☆51 Updated 10 months ago
- Test the non-AVX, AVX2 and AVX-512 speeds across various active core counts ☆183 Updated 7 months ago
- A trivial Linux kernel module to execute WBINVD on demand ☆24 Updated 9 months ago
- CPU Ultimate Latency Test. ☆103 Updated last year
- Quick sort code using AVX2 instructions ☆67 Updated 7 years ago
- ssmem is a simple object-based memory allocator with epoch-based garbage collection ☆34 Updated 8 years ago
- Predator: Predictive False Sharing Detection ☆21 Updated 10 years ago
- Tools and Reference Code for Intel Optimizations (eg Large Pages) ☆130 Updated 3 weeks ago
- ☆30 Updated 2 years ago