satishphd / Teaching-Intel-Intrinsics-for-SIMD-Parallelism
Teaching Vectorization and SIMD using Intel Intrinsics in a Computer Organization and Architecture class
☆15Updated 5 months ago
Alternatives and similar repositories for Teaching-Intel-Intrinsics-for-SIMD-Parallelism
Users who are interested in Teaching-Intel-Intrinsics-for-SIMD-Parallelism are comparing it to the repositories listed below
- performance experiments for C++ exception handling☆30Updated 3 years ago
- ☆58Updated 2 weeks ago
- Little OpenMP Library☆164Updated 2 years ago
- A header only library implementing common mathematical functions using SIMD intrinsics☆111Updated last month
- SYCL Open Source Specification☆136Updated last week
- SYCL Conformance Tests☆70Updated last week
- An implementation of HIP that works on CPUs, across OSes.☆122Updated last year
- Intel® Instrumentation and Tracing Technology (ITT) and Just-In-Time (JIT) APIs☆118Updated this week
- Test the non-AVX, AVX2 and AVX-512 speeds across various active core counts☆225Updated 9 months ago
- Encapsulate the frequently used AVX instructions as independent modules to reduce repeated development workload.☆123Updated last year
- UME::SIMD A library for explicit simd vectorization.☆91Updated 7 years ago
- ☆144Updated last month
- SYCL Benchmark Suite☆65Updated last month
- ☆141Updated 3 weeks ago
- A fast implementation of log() and exp()☆53Updated 2 years ago
- Simple OpenCL Samples that Build with Khronos Headers and Libs☆110Updated 3 weeks ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.☆55Updated 4 months ago
- ☆151Updated 2 weeks ago
- X86 CPU topics overview for developers, oriented towards performance☆200Updated 5 months ago
- Interchangeable AoS and SoA containers☆25Updated 2 years ago
- Task graph-based asynchronous programming system using C++ coroutine☆92Updated last year
- A lightweight memory allocator for hardware-accelerated machine learning☆157Updated 4 months ago
- This is a mirror of the official libpfm4 git repository, https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/ with some local branc…☆65Updated 9 months ago
- The Berkeley Container Library☆124Updated 2 years ago
- A comparative, extendable benchmarking suite for C and C++ hash-table libraries.☆35Updated last year
- Thrust, CUB, TBB, AVX2, AVX-512, CUDA, OpenCL, OpenMP, Metal, and Rust - all it takes to sum a lot of numbers fast!☆104Updated 2 weeks ago
- Short examples illustrating AVX2 intrinsics for simple tasks.☆96Updated last year
- a small lightweight std::execution work-alike☆65Updated 4 months ago
- SYCL Reference Manual☆28Updated last year
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement.☆262Updated 6 months ago