kunpengcompute / AvxToNeon
Encapsulate the frequently used AVX instructions as independent modules to reduce repeated development workload.
☆130 Updated 2 years ago
Alternatives and similar repositories for AvxToNeon
Users who are interested in AvxToNeon are comparing it to the libraries listed below.
- This is a mirror of the official libpfm4 git repository, https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/ with some local branc… ☆69 Updated last year
- ☆59 Updated last month
- GPU-Accelerated Lossless Data Compressors Survey ☆121 Updated 5 years ago
- Example code for Intel AVX / AVX2 intrinsics. ☆144 Updated 2 years ago
- Test the non-AVX, AVX2 and AVX-512 speeds across various active core counts ☆230 Updated last year
- ☆154 Updated last week
- Provides a set of benchmarks that can be used to measure the memory bandwidth performance of CPUs ☆91 Updated last year
- CudaPAD is a PTX/SASS viewer for NVIDIA Cuda kernels and provides an on-the-fly view of the assembly. ☆126 Updated 3 years ago
- A profiler to disclose and quantify hardware features on GPUs. ☆175 Updated 3 years ago
- Intel® GPU Compute Samples ☆109 Updated 4 months ago
- Arm C Language Extensions (ACLE) ☆118 Updated this week
- ☆61 Updated 3 years ago
- assembler for NVIDIA FERMI. Imported from Google Code ☆76 Updated 10 years ago
- Intel® Instrumentation and Tracing Technology (ITT) and Just-In-Time (JIT) APIs ☆127 Updated this week
- PROGRESS64 is a C library of scalable functions for concurrent programs, primarily focused on networking applications. ☆95 Updated 2 months ago
- A collection of performance analysis tools, recipes, handy scripts, microbenchmarks & more ☆143 Updated 6 months ago
- Intel® Data Mover Library (Intel® DML) ☆96 Updated 9 months ago
- Collection of synchronization micro-benchmarks and traces from infrastructure applications ☆49 Updated 5 months ago
- The platform independent header allowing to compile any C/C++ code containing ARM NEON intrinsic functions for x86 target systems using S… ☆481 Updated 2 months ago
- Introduction to Intel AVX-512 ☆54 Updated 2 months ago
- Tools and Reference Code for Intel Optimizations (eg Large Pages) ☆146 Updated 4 months ago
- Conversion to/from half-precision floating point formats ☆380 Updated 5 months ago
- UADK (User space Accelerator Development Kit), is a user space framework for using accelerators. Active branch is 'master'. ☆52 Updated last week
- Intel® Query Processing Library (Intel® QPL) ☆106 Updated last month
- ☆144 Updated this week
- Fast AVX512 (AVX-512) quicksort + bitonic sort. ☆28 Updated 3 years ago
- Simple benchmark for memory throughput and latency ☆404 Updated 2 years ago
- immintrin_dbg.h is an include file, a wrapper around immintrin.h. It implements most of AVX, AVX2, AVX-512 vector intrinsics to enable so… ☆59 Updated 3 years ago
- Portable (POSIX/Windows/Emscripten) thread pool for C/C++ ☆388 Updated last year
- The Farm-SVE package provides a header that implements the ARM C language extensions (ACLE) for the ARM Scalable Vector Extension (SVE) i… ☆15 Updated 2 years ago