PAA-NCIC/PPoPP2017_artifact

Third party assembler and GEMM library for NVIDIA Kepler GPU

☆86

Alternatives and similar repositories for PPoPP2017_artifact

Users that are interested in PPoPP2017_artifact are comparing it to the libraries listed below.

daadaada / turingas
Assembler for NVIDIA Volta and Turing GPUs
☆246 · Jan 13, 2022 · Updated 4 years ago
hyqneuron / asfermi
assembler for NVIDIA FERMI. Imported from Google Code
☆77 · Mar 22, 2015 · Updated 11 years ago
xiuxiazhang / KeplerAs
An Open Source Kepler GPU Assembler
☆22 · Jan 23, 2017 · Updated 9 years ago
NervanaSystems / maxas
Assembler for NVIDIA Maxwell architecture
☆1,074 · Jan 3, 2023 · Updated 3 years ago
PAA-NCIC / GSWITCH
A pattern-based algorithmic autotuner for graph processing on GPUs.
☆33 · Jun 25, 2025 · Updated last year
cloudcores / CuAssembler
An unofficial cuda assembler, for all generations of SASS, hopefully ：）
☆609 · Apr 20, 2023 · Updated 3 years ago
daadaada / gas
☆49 · Dec 11, 2020 · Updated 5 years ago
PAA-NCIC / DeepPerf
DeepPerf is a set of CUDA assembly development tools
☆11 · Dec 19, 2018 · Updated 7 years ago
NVlabs / SASSI
Flexible GPGPU instrumentation
☆91 · Oct 10, 2019 · Updated 6 years ago
gpgpu-sim / cutlass-gpgpu-sim
☆28 · Oct 26, 2019 · Updated 6 years ago
decodecudabinary / Decoding-CUDA-Binary
☆55 · Nov 21, 2019 · Updated 6 years ago
hkust-adsl / gass
☆43 · Apr 3, 2022 · Updated 4 years ago
aditya4d / gemm-vega64
Implement asm gemm on vega64 for 4096x4096 fp32 matrix
☆22 · Oct 12, 2019 · Updated 6 years ago
lixiuhong / batched_gemm
☆40 · Feb 28, 2020 · Updated 6 years ago
sunlex0717 / DissectingTensorCores
☆115 · Apr 19, 2024 · Updated 2 years ago
ekondis / gpumembench
A GPU benchmark suite for assessing on-chip GPU memory bandwidth
☆113 · Aug 12, 2017 · Updated 8 years ago
llnl / pLiner
pLiner is a framework that helps programmers identify locations in the source of numerical code that are highly affected by compiler opti…
☆17 · Oct 27, 2023 · Updated 2 years ago
XiuYuLi / flexible-gemm
flexible-gemm conv of deepcore
☆17 · Dec 2, 2019 · Updated 6 years ago
vancemiller / CUDA-preemption
Experiments evaluating preemption on the NVIDIA Pascal architecture
☆16 · Nov 10, 2016 · Updated 9 years ago
cuMF / culda_cgs
Efficient LDA solution on GPUs.
☆24 · Aug 20, 2018 · Updated 7 years ago
Stefan20162016 / maxas-explained
maxas Scott Grey's maxas assembler sgemm explaining the (for me) missing parts https://github.com/NervanaSystems/maxas
☆17 · Dec 22, 2018 · Updated 7 years ago
wahibium / KFF
Scalable GPU Kernel Fission/Fusion Transformation for Memory-Bound Kernels
☆14 · Aug 26, 2015 · Updated 10 years ago
nicolaswilde / cuda-tensorcore-hgemm
☆160 · Dec 26, 2024 · Updated last year
NMSU-PEARL / PPT-GPU
Performance Prediction Toolkit for GPUs
☆41 · Mar 21, 2022 · Updated 4 years ago
hyln9 / GCNGEMM
Optimized half precision gemm assembly kernels (deprecated due to ROCm)
☆47 · Jun 16, 2017 · Updated 9 years ago
sderek / CUDAAdvisor
CUDAAdvisor: a GPU profiling tool
☆53 · Aug 24, 2018 · Updated 7 years ago
NVlabs / NVBit
☆343 · Apr 6, 2026 · Updated 3 months ago
SunsetQuest / CudaPAD
CudaPAD is a PTX/SASS viewer for NVIDIA Cuda kernels and provides an on-the-fly view of the assembly.
☆129 · Jan 17, 2023 · Updated 3 years ago
RRZE-HPC / asmbench
A Benchmark Toolkit for Assembly Instructions Using the LLVM JIT
☆18 · Oct 26, 2020 · Updated 5 years ago
pigirons / conv3x3_m1
A demo of how to write a high-performance convolution that runs on Apple silicon
☆56 · Feb 8, 2022 · Updated 4 years ago
hummingtree / cuda-graph-with-dynamic-parameters
☆17 · Aug 9, 2022 · Updated 3 years ago
JamesTheZ / VersaPipe
A framework for pipelined computing on GPU
☆30 · Jul 17, 2019 · Updated 7 years ago
FindHao / drgpu
A Top-Down Profiler for GPU Applications
☆23 · Feb 29, 2024 · Updated 2 years ago
forresti / convolution
Communication-Minimizing 2D Convolution in GPU Registers
☆30 · Sep 21, 2013 · Updated 12 years ago
pku-liang / FlexTensor
Automatic Schedule Exploration and Optimization Framework for Tensor Computations
☆184 · Apr 25, 2022 · Updated 4 years ago
QianyanTech / NBAssembler
Assembler and Decompiler for NVIDIA (Maxwell Pascal Volta Turing Ampere) GPUs.
☆96 · Feb 23, 2023 · Updated 3 years ago
escalab / GPTPU
GPTPU for SC 2021
☆52 · Mar 22, 2023 · Updated 3 years ago
ap-hynninen / cutt
CUDA Tensor Transpose (cuTT) library
☆55 · Aug 10, 2017 · Updated 8 years ago
SEP-Graph / sep-graph
This is the repo of "SEP-Graph: Finding Shortest Execution Paths for Graph Processing under a Hybrid Framework on GPU"
☆14 · Dec 11, 2018 · Updated 7 years ago