spcl / dace
DaCe - Data Centric Parallel Programming
☆522 · Updated last week
Alternatives and similar repositories for dace:
Users who are interested in dace compare it to the libraries listed below.
- Unified Collective Communication Library ☆246 · Updated this week
- A Data-Centric Compiler for Machine Learning ☆82 · Updated last year
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆206 · Updated 2 weeks ago
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆80 · Updated 2 weeks ago
- Kernel Tuner ☆326 · Updated last week
- Pluto: An automatic polyhedral parallelizer and locality optimizer ☆288 · Updated 3 weeks ago
- A code generator for array-based code on CPUs and GPUs ☆602 · Updated last week
- A Python Compiler Design Toolkit ☆336 · Updated this week
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆253 · Updated 3 weeks ago
- Rich editor for SDFGs with included profiling and debugging, static analysis, and interactive optimization. ☆19 · Updated 2 months ago
- This is the top-level repository for the Accel-Sim framework. ☆390 · Updated this week
- An unofficial cuda assembler, for all generations of SASS, hopefully :) ☆480 · Updated last year
- Assembler for NVIDIA Volta and Turing GPUs ☆216 · Updated 3 years ago
- This is a set of simple programs that can be used to explore the features of a parallel platform. ☆427 · Updated 3 weeks ago
- CUDA Kernel Benchmarking Library ☆618 · Updated this week
- collection of benchmarks to measure basic GPU capabilities ☆354 · Updated 2 months ago
- C/C++ frontend for MLIR. Also features polyhedral optimizations, parallel optimizations, and more! ☆534 · Updated 6 months ago
- Stretching GPU performance for GEMMs and tensor contractions. ☆235 · Updated last week
- Examples demonstrating available options to program multiple GPUs in a single node or a cluster ☆678 · Updated last month
- Python SYCL bindings and SYCL-based Python Array API library ☆110 · Updated this week
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain. ☆134 · Updated last week
- The Foundation for All Legate Libraries ☆212 · Updated this week
- RAJA Performance Suite ☆117 · Updated last week
- ☆236 · Updated last week
- RAJA Performance Portability Layer (C++) ☆512 · Updated last week
- SST Structural Simulation Toolkit Parallel Discrete Event Core and Services ☆151 · Updated this week
- STREAM, for lots of devices written in many programming models ☆333 · Updated 7 months ago
- Data Parallel Extension for Numba ☆80 · Updated 5 months ago
- ☆240 · Updated 2 months ago
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆867 · Updated this week