StanfordLegion / legion
The Legion Parallel Programming System
☆675 · Updated last week
Related projects:
- This is a set of simple programs that can be used to explore the features of a parallel platform. ☆405 · Updated 5 months ago
- The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs ☆1,230 · Updated 5 months ago
- A code generator for array-based code on CPUs and GPUs ☆576 · Updated last week
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆839 · Updated this week
- A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC). ☆513 · Updated 3 months ago
- ☆465 · Updated this week
- RAJA Performance Portability Layer (C++) ☆465 · Updated this week
- GraphIt - A High-Performance Domain Specific Language for Graph Analytics ☆366 · Updated last year
- DaCe - Data Centric Parallel Programming ☆490 · Updated this week
- HPCToolkit performance tools: measurement and analysis components ☆330 · Updated this week
- RAPIDS Memory Manager ☆472 · Updated this week
- Examples demonstrating available options to program multiple GPUs in a single node or a cluster ☆528 · Updated last month
- Programmable CUDA/C++ GPU Graph Analytics ☆973 · Updated last month
- The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information. ☆203 · Updated this week
- Assembler for NVIDIA Maxwell architecture ☆942 · Updated last year
- Portable and vendor neutral framework for parallel programming on heterogeneous platforms. ☆386 · Updated last month
- [ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl ☆1,669 · Updated 11 months ago
- STREAM, for lots of devices written in many programming models ☆323 · Updated 3 weeks ago
- Patterns and behaviors for GPU computing ☆1,638 · Updated 2 years ago
- A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology ☆854 · Updated 2 months ago
- The Foundation for All Legate Libraries ☆186 · Updated last week
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆187 · Updated 2 months ago
- CUDA Kernel Benchmarking Library ☆482 · Updated 3 months ago
- Official MPICH Repository ☆538 · Updated this week
- Caliper is an instrumentation and performance profiling library ☆343 · Updated this week
- QUDA is a library for performing calculations in lattice QCD on GPUs. ☆289 · Updated this week
- CUSP: A C++ Templated Sparse Matrix Library ☆400 · Updated 8 months ago
- Common in-memory tensor structure ☆890 · Updated last week
- Distributed-memory, arbitrary-precision, dense and sparse-direct linear algebra, conic optimization, and lattice reduction ☆504 · Updated 5 years ago
- Abstraction Library for Parallel Kernel Acceleration ☆349 · Updated this week