Leonardo-Ding/gpu_sgemm

☆17

Alternatives and similar repositories for gpu_sgemm

Users who are interested in gpu_sgemm are comparing it to the repositories listed below.

sammi / bazel-to-msbuild
Generate visual studio solution from a bazel workspace.
☆13 · Jan 19, 2022 · Updated 4 years ago
xurui / SiamRPNTracker
☆71 · Apr 1, 2022 · Updated 4 years ago
caslab-NCKU / CASLab-GPU-SIM
CASLab-GPU simulator in SystemC
☆11 · May 29, 2020 · Updated 6 years ago
daadaada / gas
☆49 · Dec 11, 2020 · Updated 5 years ago
decodecudabinary / Decoding-CUDA-Binary
☆55 · Nov 21, 2019 · Updated 6 years ago
md2z34 / winograd_gpu
GPU implementation of Winograd convolution
☆10 · Oct 23, 2017 · Updated 8 years ago
ArchC / riscv
RISC-V processor model
☆11 · Nov 10, 2020 · Updated 5 years ago
harshasrisri / tomasulo
An application to simulate Tomasulo's algorithm
☆11 · Jan 16, 2014 · Updated 12 years ago
lixiuhong / batched_gemm
☆40 · Feb 28, 2020 · Updated 6 years ago
BradMcDanel / sdgp
☆10 · Feb 1, 2022 · Updated 4 years ago
codyjrivera / tsm2x-imp
Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA
☆35 · Jul 28, 2020 · Updated 5 years ago
Stefan20162016 / maxas-explained
maxas Scott Grey's maxas assembler sgemm explaining the (for me) missing parts https://github.com/NervanaSystems/maxas
☆17 · Dec 22, 2018 · Updated 7 years ago
KestrelComputer / polaris
RISC-V RV64IS-compatible processor for the Kestrel-3
☆21 · Feb 24, 2023 · Updated 3 years ago
NVIDIA / healthcare-on-tap-TRT-TRITON-demo
Demonstration of the use of TensorRT and TRITON
☆16 · Feb 9, 2021 · Updated 5 years ago
YusukeNagasaka / Batched-SpMM
New batched algorithm for sparse matrix-matrix multiplication (SpMM)
☆16 · May 7, 2019 · Updated 7 years ago
csehydrogen / Winograd-OpenCL
Winograd-based convolution implementation in OpenCL
☆29 · Jan 22, 2017 · Updated 9 years ago
GPUPeople / spECK
Efficient SpGEMM on GPU using CUDA and CSR
☆61 · Jul 18, 2023 · Updated 3 years ago
GVProf / GVProf
GVProf: A Value Profiler for GPU-based Clusters
☆54 · Mar 24, 2024 · Updated 2 years ago
glacierx / rproxy
A blazing fast, cross-platform TCP & UDP proxy with automatic DNS re-resolution and a built-in terminal UI config editor. Zero-downtime h…
☆13 · Apr 21, 2026 · Updated 3 months ago
hebench / reference-seal-backend
The SEAL-CPU backend is a Reference backend engine for HEBench which is a shared library that implements the required functions specified…
☆11 · Mar 3, 2023 · Updated 3 years ago
yaozhewei / MLPruning
MLPruning, PyTorch, NLP, BERT, Structured Pruning
☆20 · Jun 29, 2021 · Updated 5 years ago
Infinite-Code / PyChart
Repo for PyChart 1.39, refs http://download.gna.org/pychart/
☆10 · Sep 29, 2014 · Updated 11 years ago
daadaada / turingas
Assembler for NVIDIA Volta and Turing GPUs
☆246 · Jan 13, 2022 · Updated 4 years ago
zhenlin36 / scatter_gather_aes_cuda
A High-Performance Side-Channel-Resistant AES on GPUs
☆13 · May 9, 2019 · Updated 7 years ago
dglai / FeatGraph
Sparse kernels for GNNs based on TVM
☆17 · Nov 18, 2020 · Updated 5 years ago
AlphaSparse / Library
A sparse BLAS lib supporting multiple backends
☆51 · Mar 18, 2026 · Updated 4 months ago
ax-jason / luafixmath
Lua binding for fixmath lib
☆21 · Jan 4, 2019 · Updated 7 years ago
MatanHamilis / one_stencil
Multiple 1-stencil implementations using nvidia cuda.
☆12 · Dec 2, 2017 · Updated 8 years ago
glampert / GLProxy
Single source file OpenGL proxy/interceptor skeleton.
☆16 · Nov 23, 2015 · Updated 10 years ago
finallyjustice / codereading
Source code comments
☆11 · Jul 12, 2026 · Updated last week
mc2-project / muse
Secure Inference Resilient Against Malicious Clients
☆14 · May 3, 2022 · Updated 4 years ago
youben11 / TFHE
Implementation of the TFHE homomorphic encryption scheme.
☆12 · May 14, 2021 · Updated 5 years ago
veltavid / HNPFuzzer
☆12 · Jan 30, 2024 · Updated 2 years ago
WeCase / rpweibo
cURL + Python Weibo Wrapper.
☆10 · Dec 8, 2017 · Updated 8 years ago
gangliao / TIGER
implement a full compiler based on c++ 11
☆21 · Apr 24, 2017 · Updated 9 years ago
dskarlatos / ElasticCuckooHashing
(elastic) cuckoo hashing
☆17 · Jun 20, 2020 · Updated 6 years ago
cloudcores / CuAssembler
An unofficial cuda assembler, for all generations of SASS, hopefully ：）
☆609 · Apr 20, 2023 · Updated 3 years ago
lakeman / trevisor
Trevisor - A single guest hypervisor with full disk encryption
☆19 · Sep 19, 2016 · Updated 9 years ago
schneems / tiny_queue
a tiny example of a threadsafe queue in C
☆18 · Aug 3, 2017 · Updated 8 years ago