satishphd / Teaching-Intel-Intrinsics-for-SIMD-Parallelism
Teaching Vectorization and SIMD using Intel Intrinsics in a Computer Organization and Architecture class
☆15 Updated 2 months ago
Alternatives and similar repositories for Teaching-Intel-Intrinsics-for-SIMD-Parallelism
Users who are interested in Teaching-Intel-Intrinsics-for-SIMD-Parallelism are comparing it to the repositories listed below
- SYCL Reference Manual ☆27 Updated last year
- Little OpenMP Library ☆160 Updated 2 years ago
- MLIR-based toolkit targeting intel heterogeneous hardware ☆41 Updated 2 months ago
- ☆56 Updated last month
- SYCL Conformance Tests ☆70 Updated last week
- SYCL Benchmark Suite ☆64 Updated 2 months ago
- SYCL Open Source Specification ☆134 Updated last week
- Companion Repository for the Lecture Slides for the Clang Libraries ☆100 Updated last month
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆108 Updated last week
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆52 Updated last month
- Library with JIT (Just-in-time) compilation support to optimize performance of small and medium matrix multiplication ☆14 Updated 4 years ago
- Task graph-based asynchronous programming system using C++ coroutine ☆89 Updated last year
- A header only library implementing common mathematical functions using SIMD intrinsics ☆105 Updated 3 months ago
- High-level C++ for Accelerator Clusters ☆145 Updated 3 weeks ago
- Collaborating on papers for the ISO C++ committee - public repo ☆26 Updated 9 months ago
- Distributed ranges is a generalization of C++ ranges for distributed data structures. ☆50 Updated 2 weeks ago
- Compiler agnostic metaprogramming library providing concepts, type operations and tuples for C++ and cuda ☆87 Updated last week
- RV: A Unified Region Vectorizer for LLVM ☆107 Updated 3 months ago
- performance experiments for C++ exception handling ☆30 Updated 3 years ago
- pika is a C++ tasking library built on std::execution with fibers, CUDA, HIP, and MPI support. ☆73 Updated this week
- ☆44 Updated this week
- This is a mirror of the official libpfm4 git repository, https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/ with some local branc… ☆62 Updated 6 months ago
- TPP experimentation on MLIR for linear algebra ☆128 Updated this week
- InstLatX64_Demo ☆43 Updated last week
- My notes on various HPC papers. ☆22 Updated 2 years ago
- Conversions to MLIR EmitC ☆128 Updated 5 months ago
- immintrin_dbg.h is an include file, a wrapper around immintrin.h. It implements most of AVX, AVX2, AVX-512 vector intrinsics to enable so… ☆57 Updated 2 years ago
- ☆29 Updated 2 years ago
- Lightweight recording and sampling of performance counters for specific code segments directly from your C++ application. ☆63 Updated this week
- Third party assembler and GEMM library for NVIDIA Kepler GPU ☆81 Updated 5 years ago