c3sr/tcu_scope

☆50

Alternatives and similar repositories for tcu_scope

Users who are interested in tcu_scope are comparing it to the libraries listed below.

temporal-hpc / reduction-tensor-cores
Fast GPU based tensor core reductions
☆12 · Jan 13, 2023 · Updated 3 years ago
codyjrivera / tsm2x-imp
Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA
☆35 · Jul 28, 2020 · Updated 5 years ago
apuaaChen / vectorSparse
☆32 · Aug 24, 2022 · Updated 3 years ago
BoyuanFeng / APNN-TC
☆20 · Aug 26, 2021 · Updated 4 years ago
MatanHamilis / one_stencil
Multiple 1-stencil implementations using nvidia cuda.
☆12 · Dec 2, 2017 · Updated 8 years ago
oresths / tSparse
A GPU algorithm for sparse matrix-matrix multiplication
☆74 · Oct 1, 2020 · Updated 5 years ago
TiledTensor / TiledCUDA
We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel …
☆192 · Jan 28, 2025 · Updated last year
TiledTensor / TiledLower
TiledLower is a Dataflow Analysis and Codegen Framework written in Rust.
☆13 · Nov 23, 2024 · Updated last year
YashasSamaga / ConvolutionBuildingBlocks
GEMM and Winograd based convolutions using CUTLASS
☆28 · Jul 15, 2020 · Updated 6 years ago
YusukeNagasaka / Batched-SpMM
New batched algorithm for sparse matrix-matrix multiplication (SpMM)
☆16 · May 7, 2019 · Updated 7 years ago
ParCIS / Magicube
Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.
☆92 · Nov 23, 2022 · Updated 3 years ago
ZhangJingrong / gpu_topK_benchmark
GPU TopK Benchmark
☆18 · Dec 19, 2024 · Updated last year
shriramsb / vdnn-plus-plus
Implementation of vDNN++; an improvement over vDNN
☆18 · Dec 7, 2018 · Updated 7 years ago
md2z34 / winograd_gpu
GPU implementation of Winograd convolution
☆10 · Oct 23, 2017 · Updated 8 years ago
xxcclong / GNN-Computing
Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations"
☆42 · Nov 16, 2021 · Updated 4 years ago
itzmeanjan / ff-gpu
Finite Field Operations on GPGPU
☆15 · Jul 23, 2023 · Updated 2 years ago
microsoft / FractalTensor
FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …
☆32 · Dec 21, 2024 · Updated last year
nullplay / Unified-Convolution-Framework
☆10 · Apr 24, 2023 · Updated 3 years ago
billmuch / matmul_perf_test
☆15 · Apr 15, 2022 · Updated 4 years ago
frostt-tensor / tensor_parser
A package for constructing sparse tensors from CSV-like data sources.
☆11 · Dec 24, 2017 · Updated 8 years ago
escalab / RTSpMSpM
☆25 · Apr 13, 2025 · Updated last year
microsoft / cusync
☆27 · Feb 20, 2024 · Updated 2 years ago
decodecudabinary / Decoding-CUDA-Binary
☆55 · Nov 21, 2019 · Updated 6 years ago
hgyhungry / ge-spmm
☆115 · Jul 3, 2021 · Updated 5 years ago
YukeWang96 / TC-GNN_ATC23
Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.
☆58 · Oct 16, 2023 · Updated 2 years ago
SuperScientificSoftwareLaboratory / DASP
Source code of the SC '23 paper: "DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multipli…
☆29 · Jun 18, 2024 · Updated 2 years ago
lixiuhong / batched_gemm
☆40 · Feb 28, 2020 · Updated 6 years ago
getianao / ngAP
ngAP's artifact for ASPLOS'24
☆25 · Jul 29, 2025 · Updated 11 months ago
OSU-STARLAB / UVM_benchmark
☆34 · Sep 9, 2020 · Updated 5 years ago
pnnl / TCBNN
☆39 · Jul 25, 2022 · Updated 3 years ago
YukeWang96 / GNNAdvisor_OSDI21
Artifact for OSDI'21 GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs.
☆71 · Mar 2, 2023 · Updated 3 years ago
aoli-al / HFuse
Horizontal Fusion
☆24 · Jan 7, 2022 · Updated 4 years ago
sderek / CUDAAdvisor
CUDAAdvisor: a GPU profiling tool
☆53 · Aug 24, 2018 · Updated 7 years ago
wmmae / wmma_extension
An extension library of WMMA API (Tensor Core API)
☆115 · Jul 12, 2024 · Updated 2 years ago
wpybtw / Skywalker
☆12 · Dec 17, 2023 · Updated 2 years ago
illinois-impact / EMOGI
☆26 · Dec 4, 2020 · Updated 5 years ago
gpgpu-sim / cutlass-gpgpu-sim
☆28 · Oct 26, 2019 · Updated 6 years ago
YukeWang96 / MGG_OSDI23
Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult…
☆40 · Mar 17, 2024 · Updated 2 years ago
microsoft / ConvStencil
☆37 · Apr 10, 2024 · Updated 2 years ago