satishphd / Teaching-Intel-Intrinsics-for-SIMD-Parallelism
Teaching Vectorization and SIMD using Intel Intrinsics in a Computer Organization and Architecture class
☆16Updated 10 months ago
Alternatives and similar repositories for Teaching-Intel-Intrinsics-for-SIMD-Parallelism
Users who are interested in Teaching-Intel-Intrinsics-for-SIMD-Parallelism are comparing it to the libraries listed below.
- ☆59Updated this week
- Little OpenMP Library☆169Updated 3 years ago
- SYCL Conformance Tests☆70Updated this week
- A header only library implementing common mathematical functions using SIMD intrinsics☆114Updated 3 months ago
- SYCL Reference Manual☆28Updated last year
- ☆143Updated last week
- RV: A Unified Region Vectorizer for LLVM☆112Updated 6 months ago
- ☆154Updated last week
- SYCL Open Source Specification☆141Updated last month
- SYCL Benchmark Suite☆66Updated 5 months ago
- Simple OpenCL Samples that Build with Khronos Headers and Libs☆117Updated 2 weeks ago
- A lightweight memory allocator for hardware-accelerated machine learning☆176Updated 2 months ago
- The translator that supports translating NVPTX to SPIR-V. This translator is modified from LLVM-SPIR-V Translator.☆44Updated 4 years ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.☆56Updated 9 months ago
- Intel® Instrumentation and Tracing Technology (ITT) and Just-In-Time (JIT) APIs☆126Updated 2 weeks ago
- Generate SQL from TableGen code - This is part of the tutorial "How to write a TableGen backend" in 2021 LLVM Developers' Meeting.☆34Updated 2 years ago
- An implementation of HIP that works on CPUs, across OSes.☆131Updated last year
- High-level C++ for Accelerator Clusters☆153Updated 3 weeks ago
- Encapsulate the frequently used AVX instructions as independent modules to reduce repeated development workload.☆128Updated last year
- Utilities to measure read access times of caches, memory, and hardware prefetches for simple and fused operations☆85Updated 2 years ago
- Companion Repository for the Lecture Slides for the Clang Libraries☆120Updated 3 months ago
- X86 CPU topics overview for developers, oriented towards performance☆203Updated 9 months ago
- performance experiments for C++ exception handling☆32Updated 3 years ago
- UME::SIMD A library for explicit simd vectorization.☆91Updated 7 years ago
- Source code for 'Modern Parallel Programming with C++ and Assembly' by Dan Kusswurm☆71Updated 3 years ago
- Very low-overhead timer/counter interfaces for C on Intel 64 processors.☆139Updated last month
- Utilities for accessing AMD's Machine-Readable GPU ISA Specifications.☆43Updated 2 months ago
- SYCL for Vitis: Experimental fusion of triSYCL with Intel SYCL oneAPI DPC++ up-streaming effort into Clang/LLVM☆122Updated last year
- A collection of performance analysis tools, recipes, handy scripts, microbenchmarks & more☆142Updated 5 months ago
- This is a mirror of the official libpfm4 git repository, https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/ with some local branc…☆69Updated last year