StanfordLegion / legion
The Legion Parallel Programming System
☆709 · Updated last week
Alternatives and similar repositories for legion:
Users who are interested in legion often compare it to the libraries listed below.
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆865 · Updated this week
- The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs ☆1,286 · Updated 10 months ago
- RAJA Performance Portability Layer (C++) ☆507 · Updated this week
- This is a set of simple programs that can be used to explore the features of a parallel platform. ☆423 · Updated this week
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆200 · Updated 3 months ago
- The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information. ☆210 · Updated this week
- CUSP : A C++ Templated Sparse Matrix Library ☆411 · Updated 4 months ago
- ☆520 · Updated this week
- STREAM, for lots of devices written in many programming models ☆328 · Updated 6 months ago
- A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC). ☆527 · Updated this week
- A code generator for array-based code on CPUs and GPUs ☆597 · Updated this week
- Programmable CUDA/C++ GPU Graph Analytics ☆1,012 · Updated 7 months ago
- Patterns and behaviors for GPU computing ☆1,705 · Updated 2 years ago
- HPCToolkit performance tools: measurement and analysis components ☆341 · Updated 3 weeks ago
- Kernel Tuner ☆323 · Updated this week
- RAPIDS Memory Manager ☆551 · Updated this week
- Caliper is an instrumentation and performance profiling library ☆371 · Updated this week
- GraphIt - A High-Performance Domain Specific Language for Graph Analytics ☆376 · Updated 2 years ago
- ☆131 · Updated last year
- CUDA Kernel Benchmarking Library ☆585 · Updated 3 months ago
- The Foundation for All Legate Libraries ☆205 · Updated last week
- oneAPI Math Library (oneMath) ☆647 · Updated last week
- Abstraction Library for Parallel Kernel Acceleration ☆366 · Updated 3 weeks ago
- Portable and vendor neutral framework for parallel programming on heterogeneous platforms. ☆413 · Updated last week
- [ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl ☆1,733 · Updated last year
- An application-focused API for memory management on NUMA & GPU architectures ☆347 · Updated this week
- ☆408 · Updated this week
- High-performance automatic differentiation of LLVM and MLIR. ☆1,350 · Updated this week
- Modular C++ Toolkit for Performance Analysis and Logging. Profiling API and Tools for C, C++, CUDA, Fortran, and Python. The C++ template… ☆358 · Updated 7 months ago
- Examples demonstrating available options to program multiple GPUs in a single node or a cluster ☆638 · Updated 2 weeks ago