linnanwang/BLASX

A heterogeneous multi-GPU level-3 BLAS library

☆46

Alternatives and similar repositories for BLASX

Users who are interested in BLASX are comparing it to the libraries listed below.

GPUPeople / spECK
Efficient SpGEMM on GPU using CUDA and CSR
☆61 · Jul 18, 2023 · Updated 3 years ago
jeng1220 / cuGemmProf
A simple tool to profile performance of multiple combinations of GEMM of cuBLAS
☆25 · Feb 9, 2021 · Updated 5 years ago
GPUPeople / faimGraph
This code base represents "faimGraph: High Performance Management of Fully-dynamic Graphs under tight Memory Constraints on the GPU"
☆16 · Apr 23, 2021 · Updated 5 years ago
CNugteren / CLTune
CLTune: An automatic OpenCL & CUDA kernel tuner
☆186 · Dec 12, 2022 · Updated 3 years ago
flame / fmm-gen
Generating Families of Practical Fast Matrix Multiplication Algorithms
☆12 · Jul 7, 2017 · Updated 9 years ago
timlueth / SG-Lib-Matlab-Toolbox
Solid Geometry Library Toolbox
☆22 · Sep 24, 2025 · Updated 10 months ago
ShadenSmith / splatt
The Surprisingly ParalleL spArse Tensor Toolkit.
☆73 · Mar 3, 2022 · Updated 4 years ago
csenn / nn-visualizer
☆52 · Mar 1, 2025 · Updated last year
mechanoChem / mechanoChemFEM
mechanoChemFEM is a library for modeling mechano-chemical problems using the finite element method. It is built upon Deal.ii, PetSc,…
☆12 · Oct 5, 2022 · Updated 3 years ago
hpcgarage / ParTI
Parallel Tensor Infrastructure (ParTI!)
☆34 · Aug 18, 2020 · Updated 5 years ago
graphchallenge / GraphChallenge
Graph Challenge
☆33 · Aug 19, 2019 · Updated 6 years ago
linnanwang / superneurons-release
The release repository of superneurons
☆54 · Feb 13, 2021 · Updated 5 years ago
ARM-software / nomali-model
A simple Mali 6xx/7xx register interface model that doesn't do any rendering.
☆13 · Jan 29, 2016 · Updated 10 years ago
apuaaChen / vectorSparse
☆32 · Aug 24, 2022 · Updated 3 years ago
pkumod / timingsubg
Code for the paper "Time Constrained Continuous Subgraph Search Over Streaming Graphs. ICDE 2019: 1082-1093". Authors: Youhuan Li, Lei Zo…
☆12 · Jun 18, 2021 · Updated 5 years ago
Alpine-DAV / vtk-h
☆11 · Jul 13, 2022 · Updated 4 years ago
XiuYuLi / flexible-gemm
flexible-gemm conv of deepcore
☆17 · Dec 2, 2019 · Updated 6 years ago
ogreen / GpuTriangleCounting
Triangle Counting for the GPU using CUDA.
☆14 · Nov 5, 2015 · Updated 10 years ago
lunochod / caffe
Caffe: a fast open framework for deep learning.
☆14 · Aug 26, 2015 · Updated 10 years ago
ParCoreLab / aCG
GPU-accelerated linear solvers based on the conjugate gradient (CG) method, supporting NVIDIA and AMD GPUs with GPU-aware MPI, NCCL, RCCL…
☆16 · Mar 14, 2026 · Updated 4 months ago
bryancatanzaro / inplace
CUDA and OpenMP implementations of C2R/R2C inplace transposition
☆49 · Feb 10, 2015 · Updated 11 years ago
eth-cscs / conflux
Distributed Communication-Optimal LU-factorization Algorithm
☆12 · Aug 1, 2021 · Updated 4 years ago
springer13 / hptt
High-Performance Tensor Transpose library
☆205 · May 13, 2023 · Updated 3 years ago
ContinuumIO / pydata-amsterdam2019-numba
Numba GPU tutorial notebooks for PyData Amsterdam 2019
☆23 · Jun 25, 2026 · Updated last month
kokkos / kokkos-miniapps
Mini-applications that exclusively use the Kokkos programming model
☆12 · Mar 21, 2023 · Updated 3 years ago
ECP-ExaGraph / grappolo
OpenMP implementation of Graph Community Detection, with a number of parallel heuristics/approximate computing techniques
☆23 · Jun 15, 2023 · Updated 3 years ago
wsong83 / cppVCD
C++ parser for reading a VCD (value change dump) file
☆10 · Jul 15, 2013 · Updated 13 years ago
kracwarlock / Movie-Recommender-and-Score-Prediction-System
Provides Movie Recommendations on the MovieLens ml-100k dataset using Collaborative Filtering
☆11 · Nov 14, 2013 · Updated 12 years ago
SpRegTiling / sparse-register-tiling
☆10 · Mar 2, 2024 · Updated 2 years ago
nv-legate / legate.hello
Legate Hello World Pedagogical Library
☆10 · Apr 5, 2023 · Updated 3 years ago
codyjrivera / tsm2x-imp
Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA
☆35 · Jul 28, 2020 · Updated 6 years ago
notini / csr-formatter
C++ package to store Matrix Market (.mtx) file format sparse matrices in Compressed Row Storage (CSR) format.
☆17 · Oct 16, 2019 · Updated 6 years ago
THU-numbda / SketchNE
Embedding billion-scale networks accurately in one hour (TKDE paper 2023)
☆11 · Sep 26, 2023 · Updated 2 years ago
xnd-project / cuda-benchmarks
Collection of CUDA benchmarks, with a focus on unified vs. explicit memory management.
☆21 · Oct 15, 2019 · Updated 6 years ago
AnonymousYWL / MYLIB
☆18 · Apr 8, 2022 · Updated 4 years ago
viennacl / viennacl-dev
Developer repository for ViennaCL. Visit http://viennacl.sourceforge.net/ for the latest releases.
☆295 · Nov 22, 2021 · Updated 4 years ago
zefiros-software / BSPLib
BSPLib is a fast and easy-to-use C++ implementation of the Bulk Synchronous Parallel (BSP) threading model.
☆22 · Jun 8, 2018 · Updated 8 years ago
AlphaSparse / Library
A sparse BLAS lib supporting multiple backends
☆51 · Mar 18, 2026 · Updated 4 months ago
olcf / NVIDIA-tensor-core-examples
☆20 · Nov 7, 2019 · Updated 6 years ago