Tohoku-University-Takizawa-Lab / neoSYCL
A SYCL Implementation for CPU and SX-Aurora TSUBASA
☆53 Updated 2 years ago
Alternatives and similar repositories for neoSYCL
Users who are interested in neoSYCL are comparing it to the libraries listed below.
- This is the git repository for RIKEN simulator designed to simulate the binary code for Fujitsu A64FX. ☆36 Updated 5 years ago
- Library of High Precision Sparse Matrix Operations Accelerated by SIMD ☆42 Updated 4 years ago
- Omni Compiler for C and Fortran programs with XcalableMP and OpenACC directives ☆61 Updated last year
- VEDA (VE Driver API) ☆17 Updated 4 months ago
- Another|Alternative|Awesome VE Offloading stack using ve-urpc ☆14 Updated last year
- ☆52 Updated 4 years ago
- Itoyori: A distributed multi-threading runtime system for global-view fork-join task parallelism ☆20 Updated last year
- NLCPy : NumPy-like API accelerated with SX-Aurora TSUBASA ☆15 Updated last year
- llvm-project cloned from https://github.com/llvm/llvm-project and modified for VE ☆19 Updated last week
- ASM generation tool for GAS/NASM/MASM with Xbyak-like syntax in Python ☆12 Updated 3 months ago
- instruction-bench ☆36 Updated 2 years ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆84 Updated last week
- Tutorials for ARM SVE on Docker ☆43 Updated 2 years ago
- Official BOLT Repository ☆29 Updated 10 months ago
- SYCL Reference Manual ☆28 Updated last year
- RAJA Performance Suite ☆117 Updated last week
- ROCm SPARSE marshalling library ☆67 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆119 Updated this week
- An HPL-AI implementation for Fugaku ☆21 Updated 3 years ago
- ☆15 Updated 2 years ago
- SYCL Benchmark Suite ☆65 Updated this week
- Armv8 A64 Assembly & Intrinsics Guide Server ☆25 Updated last year
- mallocMC: Memory Allocator for Many Core Architectures ☆56 Updated last month
- OpenMP Offloading Validation & Verification Suite; Official repository. We have migrated from bitbucket!! For documentation, results, pub… ☆58 Updated 2 weeks ago
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆56 Updated 2 months ago
- World championship code for Graph500 ☆25 Updated last year
- CUDA Dynamic Memory Allocator for SOA Data Layout ☆35 Updated 3 years ago
- ReMPI (MPI Record-and-Replay) ☆39 Updated last year
- Next generation LAPACK implementation for ROCm platform ☆103 Updated this week
- Base container for developing C++ and Fortran HPC applications ☆18 Updated 3 years ago