berenger-eu / avx-512-sort
Fast AVX512 (AVX-512) quicksort + bitonic sort.
☆26 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for avx-512-sort
- InstLatX64_Demo ☆41 · Updated 2 weeks ago
- This is a mirror of the official libpfm4 git repository, https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/ with some local branc… ☆55 · Updated last month
- Intel® Instrumentation and Tracing Technology (ITT) and Just-In-Time (JIT) API ☆89 · Updated 2 months ago
- immintrin_dbg.h is an include file, a wrapper around immintrin.h. It implements most of AVX, AVX2, AVX-512 vector intrinsics to enable so… ☆57 · Updated last year
- AVX512F and AVX2 versions of quick sort ☆105 · Updated 7 years ago
- User-space Page Management ☆104 · Updated 3 months ago
- ROB size testing utility ☆135 · Updated 2 years ago
- A small library and kernel module for easy access to x86 performance monitor counters under Linux. ☆94 · Updated 6 months ago
- ☆35 · Updated 2 years ago
- Benchmarks for auto-vectorization and revectorization, including both hand-vectorized and scalar code ☆26 · Updated 5 years ago
- CERE: Codelet Extractor and REplayer ☆41 · Updated last year
- Parallel Memory Bandwidth Measurement / Benchmark Tool ☆104 · Updated 2 years ago
- Very low-overhead timer/counter interfaces for C on Intel 64 processors. ☆116 · Updated 5 years ago
- Predator: Predictive False Sharing Detection ☆21 · Updated 10 years ago
- ☆35 · Updated 5 months ago
- A Benchmark Toolkit for Assembly Instructions Using the LLVM JIT ☆16 · Updated 4 years ago
- Code used for generating charts and measurements of nontemporal stores ☆9 · Updated 6 years ago
- assembler for NVIDIA FERMI. Imported from Google Code ☆70 · Updated 9 years ago
- Artifact Evaluation Reproduction for "Software Prefetching for Indirect Memory Accesses", CGO 2017, using CK. ☆38 · Updated 3 years ago
- A description of Minotaur can be found in https://arxiv.org/abs/2306.00229. ☆95 · Updated 3 months ago
- The CLooG Code Generator in the Polyhedral Model ☆43 · Updated last year
- UB-aware interpreter for LLVM debugging ☆17 · Updated this week
- Enabling on-the-fly manipulations with LLVM IR code of CUDA sources ☆102 · Updated last year
- Linux Cross-Memory Attach ☆88 · Updated 2 months ago
- CudaPAD is a PTX/SASS viewer for NVIDIA Cuda kernels and provides an on-the-fly view of the assembly. ☆107 · Updated last year
- A trivial Linux kernel module to execute WBINVD on demand ☆24 · Updated 11 months ago
- CPU Ultimate Latency Test. ☆106 · Updated last year
- Information about AVX-512 support on recent Intel processors ☆43 · Updated 2 years ago
- Test the non-AVX, AVX2 and AVX-512 speeds across various active core counts ☆191 · Updated 3 weeks ago
- CUDAAdvisor: a GPU profiling tool ☆48 · Updated 6 years ago