Portable and Flexible DGEMM Library for GPUs (OpenCL, CUDA, CAL) with special support for HPL
☆17Apr 5, 2018Updated 7 years ago
Alternatives and similar repositories for caldgemm
Users that are interested in caldgemm are comparing it to the libraries listed below.
- High Performance Linpack for GPUs (Using OpenCL, CUDA, CAL)☆92Oct 22, 2015Updated 10 years ago
- GPU implementation of classical molecular dynamics proxy application.☆31Jan 30, 2017Updated 9 years ago
- Argonne Leadership Computing Facility OpenCL tutorial☆10Aug 22, 2025Updated 7 months ago
- An HPL-AI implementation for Fugaku☆23Jun 29, 2021Updated 4 years ago
- ☆11Aug 8, 2021Updated 4 years ago
- Rapid HPC Orchestration in the Cloud☆28Oct 3, 2023Updated 2 years ago
- OpenCL porting of the GROMACS molecular simulation toolkit☆27Sep 5, 2015Updated 10 years ago
- A neutral particle transport mini-app to study performance of sweeps on unstructured, 3D tetrahedral meshes.☆19Sep 20, 2022Updated 3 years ago
- Open source of an IBM Optimized version of the HPCG benchmark.☆17Sep 17, 2025Updated 6 months ago
- Set of OpenCL microbenchmarks☆29Nov 19, 2025Updated 4 months ago
- EPCC I/O benchmarking applications☆12Dec 15, 2021Updated 4 years ago
- GPU Optimization and Memory Abstraction Framework☆32Oct 31, 2019Updated 6 years ago
- Tools to run and parse MKL verbose mode☆18Jun 28, 2022Updated 3 years ago
- JIT Compilation for Multiple Architectures: C++, OpenMP, CUDA, HIP, OpenCL, Metal☆13Aug 6, 2025Updated 7 months ago
- A python script that reads in a fortran 77 (.f or .F) fixed form file and converts it to a free form Fortran 90 file (.f90 or .F90).☆25Apr 12, 2016Updated 9 years ago
- A complete Scheme R5RS implementation, designed to be embedded into C and C++ applications.☆17Feb 21, 2015Updated 11 years ago
- A proposal for a standard parallel algorithms library for ISO C++.☆22Feb 28, 2014Updated 12 years ago
- Simple port of hpl-2.0 to use NVIDIA GPU acceleration with CUBLAS☆29May 13, 2013Updated 12 years ago
- Source of BLAS via BLIS☆13May 6, 2024Updated last year
- Modern Fortran wrappers around MPI routines☆36Dec 17, 2025Updated 3 months ago
- Multigrid Methods - An Overview, A lecture series at Imperial College☆26Jan 6, 2025Updated last year
- ☆14Mar 21, 2019Updated 7 years ago
- C and NVIDIA CUDA code for multi-view deconvolution☆11Jul 25, 2016Updated 9 years ago
- ROCm Command Line Profiler - moved to https://github.com/GPUOpen-Tools/RCP☆10Aug 24, 2017Updated 8 years ago
- Compute applications.☆25Dec 12, 2019Updated 6 years ago
- Workflow management system for the automated and distributed analysis of large-scale experimental data.☆13Oct 3, 2024Updated last year
- Example on long write (long characteristic)☆12Sep 3, 2015Updated 10 years ago
- GPU Debugging SDK for ROCm☆10Mar 21, 2019Updated 7 years ago
- QUICK, a GPU-enabled ab initio quantum chemistry software. Now moved to the main branch: https://github.com/merzlab/QUICK☆11Jan 19, 2015Updated 11 years ago
- C++1y coroutine library.☆17Mar 26, 2018Updated 7 years ago
- Introduction to OpenACC☆30Jan 25, 2021Updated 5 years ago
- A Monte Carlo Neutron Transport Mini-App☆15Apr 15, 2019Updated 6 years ago
- A tool allowing students of Coursera's Heterogeneous Parallel Programming to work on homework using a machine without a CUDA GPU.☆11Mar 11, 2015Updated 11 years ago
- ☆14Aug 4, 2022Updated 3 years ago
- C library containing high resolution timer implementation for several platforms.☆10Oct 20, 2020Updated 5 years ago
- Code examples for the CUDA workshop☆36Sep 19, 2022Updated 3 years ago
- A BUDE virtual-screening benchmark, in many programming models☆30Oct 15, 2024Updated last year
- ☆29Dec 16, 2022Updated 3 years ago
- Vectorization EDSL library☆15Jun 24, 2019Updated 6 years ago