spcl / dace
DaCe - Data Centric Parallel Programming
☆497 · Updated this week
Related projects
Alternatives and complementary repositories for dace
- Kernel Tuner ☆286 · Updated this week
- A Data-Centric Compiler for Machine Learning ☆82 · Updated 10 months ago
- CUDA Kernel Benchmarking Library ☆513 · Updated 2 weeks ago
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆73 · Updated 4 months ago
- Collection of benchmarks to measure basic GPU capabilities ☆264 · Updated 4 months ago
- Unified Collective Communication Library ☆205 · Updated this week
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆192 · Updated this week
- STREAM, for lots of devices written in many programming models ☆325 · Updated 2 months ago
- This is a set of simple programs that can be used to explore the features of a parallel platform. ☆411 · Updated this week
- A Python Compiler Design Toolkit ☆272 · Updated this week
- ☆215 · Updated this week
- The Foundation for All Legate Libraries ☆189 · Updated last month
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆180 · Updated this week
- Assembler for NVIDIA Volta and Turing GPUs ☆200 · Updated 2 years ago
- ☆485 · Updated this week
- ☆224 · Updated 2 months ago
- Rich editor for SDFGs with included profiling and debugging, static analysis, and interactive optimization. ☆19 · Updated last week
- Advanced Profiling and Analytics for AMD Hardware ☆135 · Updated this week
- ☆228 · Updated this week
- Pluto: An automatic polyhedral parallelizer and locality optimizer ☆274 · Updated 5 months ago
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain. ☆123 · Updated this week
- An out-of-tree MLIR dialect template. ☆91 · Updated 2 months ago
- The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information. ☆206 · Updated this week
- C/C++ frontend for MLIR. Also features polyhedral optimizations, parallel optimizations, and more! ☆485 · Updated last month
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆850 · Updated this week
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels. ☆123 · Updated last year
- This is the top-level repository for the Accel-Sim framework. ☆303 · Updated 2 weeks ago
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆309 · Updated this week
- A code generator for array-based code on CPUs and GPUs ☆587 · Updated this week
- Data Parallel Extension for Numba ☆77 · Updated last week