jeffhammond / dpcpp-tutorial
Intel Data Parallel C++ (and SYCL 2020) Tutorial.
☆93 Updated 3 years ago
Alternatives and similar repositories for dpcpp-tutorial
Users who are interested in dpcpp-tutorial are comparing it to the libraries listed below
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆261 Updated 6 months ago
- Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben A… ☆275 Updated 3 months ago
- portDNN is a library implementing neural network algorithms written using SYCL ☆113 Updated last year
- SYCL Benchmark Suite ☆65 Updated 3 weeks ago
- Examples for using SYCL on CUDA ☆62 Updated last week
- SYCL Open Source Specification ☆136 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆119 Updated this week
- Scalable High-performance Algorithms and Data-structures ☆132 Updated last month
- RAJA Performance Suite ☆117 Updated this week
- Next generation LAPACK implementation for ROCm platform ☆105 Updated this week
- Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) ☆109 Updated 2 years ago
- SYCL Conformance Tests ☆70 Updated 2 weeks ago
- STREAM, for lots of devices written in many programming models ☆344 Updated 10 months ago
- Examples for HIP ☆209 Updated 7 months ago
- Kokkos C++ Performance Portability Programming Ecosystem: Profiling and Debugging Tools ☆130 Updated 3 weeks ago
- ☆247 Updated last month
- An implementation of HIP that works on CPUs, across OSes. ☆121 Updated last year
- Next generation FFT implementation for ROCm ☆195 Updated this week
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated 2 years ago
- DLA-Future ☆75 Updated last month
- Full-speed Array of Structures access ☆171 Updated 2 years ago
- Advanced Profiling and Analytics for AMD Hardware ☆159 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆172 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆84 Updated 2 weeks ago
- Distributed ranges is a generalization of C++ ranges for distributed data structures. ☆51 Updated last week
- CUDA kernel author's tools ☆111 Updated 3 years ago
- Subset of BLAS routines optimized for NVIDIA GPUs ☆71 Updated 2 years ago
- Next generation SPARSE implementation for ROCm platform ☆129 Updated this week
- Kernel Tuning Toolkit ☆61 Updated 2 weeks ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆106 Updated 7 years ago