haswelliris / CPC2018-GROMACS
CPC2018: finals of the 2nd Domestic CPU Parallel Application Challenge
☆11Updated 7 years ago
Alternatives and similar repositories for CPC2018-GROMACS
Users who are interested in CPC2018-GROMACS are comparing it to the libraries listed below
- CSR5-based SpMV on CPUs, GPUs and Xeon Phi☆110Updated last year
- ☆98Updated 9 years ago
- A highly efficient library for GEMM operations on Sunway TaihuLight☆18Updated 5 years ago
- A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves (SpTRSV)☆22Updated 5 years ago
- A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs☆56Updated 4 years ago
- ☆19Updated 2 years ago
- A sparse BLAS lib supporting multiple backends☆49Updated 2 months ago
- High-performance, GPU-aware communication library☆86Updated last month
- 2018 parallel computing course repo☆33Updated 4 years ago
- Asynchronous Multi-GPU Programming Framework☆48Updated 4 years ago
- Intermediate MPI lesson☆27Updated 2 years ago
- Parallel Tensor Infrastructure (ParTI!)☆33Updated 5 years ago
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial☆350Updated 2 months ago
- GPU Code optimizer for stencil computations. Refer to our IPDPS'19 paper for more details☆25Updated 6 years ago
- Example code for Intel AVX / AVX2 intrinsics.☆145Updated 2 years ago
- PanguLU: A Scalable Regular Two-Dimensional Block-Cyclic Sparse Direct Solver on Distributed Heterogeneous Systems☆45Updated 6 months ago
- Domain-specific framework for performance analysis of parallel programs☆16Updated this week
- Fast Synchronization-Free Algorithms for Parallel Sparse Triangular Solves with Multiple Right-Hand Sides (SpTRSM)☆14Updated 5 years ago
- 14 basic topics for VEGA64 performance optimization☆63Updated 4 years ago
- ☆14Updated 7 years ago
- CUDA Tensor Transpose (cuTT) library☆53Updated 8 years ago
- Stencil Probe - a stencil microbenchmark☆30Updated 13 years ago
- Parallelized and vectorized SpMV on Intel Xeon Phi (Knights Landing, AVX512, KNL)☆24Updated 2 years ago
- Efficient SpGEMM on GPU using CUDA and CSR☆59Updated 2 years ago
- A Deep Learning Framework customized for Sunway TaihuLight☆41Updated 7 years ago
- development repository for the open earth compiler☆82Updated 4 years ago
- PaRSEC is a generic framework for architecture aware scheduling and management of micro-tasks on distributed, GPU accelerated, many-core …☆76Updated 3 months ago
- Medusa: Building GPU-based Parallel Sparse Graph Applications with Sequential C/C++ Code☆63Updated 5 years ago
- Official HPCG benchmark source code☆340Updated last year
- Source code that accompanies The CUDA Handbook.☆566Updated 4 months ago