google / memcpy-gemm
☆16 Updated 2 years ago
Alternatives and similar repositories for memcpy-gemm:
Users who are interested in memcpy-gemm are comparing it to the libraries listed below:
- ☆56 Updated last month
- Emulating DMA Engines on GPUs for Performance and Portability ☆39 Updated 9 years ago
- Tests and benchmarks for cudnn (and in the future, other nvidia libraries) ☆53 Updated 4 years ago
- npcomp - An aspirational MLIR based numpy compiler ☆51 Updated 4 years ago
- Portable 128-bit SIMD intrinsics ☆58 Updated last year
- ☆95 Updated this week
- A copy of the Intel Cilk Plus runtime system with modifications to work with OpenCilk and its associated tools. ☆12 Updated 4 years ago
- SYCL Reference Manual ☆27 Updated last year
- Asynchronous Task and Memory Interface, or ATMI, is a runtime framework and programming model for heterogeneous CPU-GPU systems. It provi… ☆67 Updated last year
- MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com ☆38 Updated last year
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆108 Updated this week
- ☆34 Updated last year
- A sandbox for quick iteration and experimentation on projects related to IREE, MLIR, and LLVM ☆57 Updated last month
- ☆16 Updated 5 years ago
- This is ROCgdb, the ROCm source-level debugger for Linux, based on GDB, the GNU source-level debugger. ☆55 Updated this week
- Experiments and prototypes associated with IREE or MLIR ☆50 Updated 9 months ago
- SYCL-ML is a C++ library, implementing classical machine learning algorithms using SYCL. ☆66 Updated 5 years ago
- AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/Ope… ☆62 Updated this week
- Open source cross-platform compiler for compute-intensive loops used in AI algorithms, from Microsoft Research ☆109 Updated last year
- A C/C++ task-based programming model for shared memory and distributed parallel computing. ☆71 Updated 4 years ago
- This repo is a mirror of upstream https://github.com/llvm/llvm-project . Every three hours the main branch is mirrored from upstream. Pl… ☆24 Updated last year
- Tools and extensions for CUDA profiling ☆65 Updated 5 years ago
- C99/C++ header-only library for division via fixed-point multiplication by inverse ☆51 Updated last year
- Mirror kept for legacy. Moved to https://github.com/llvm/llvm-project ☆25 Updated 5 years ago
- ☆59 Updated this week
- Mirror kept for legacy. Moved to https://github.com/llvm/llvm-project ☆34 Updated 5 years ago
- MLIR-based partitioning system ☆82 Updated this week
- Kernel Tuning Toolkit ☆59 Updated last month
- Information about AVX-512 support on recent Intel processors ☆45 Updated 3 years ago
- ☆13 Updated 3 years ago