daadaada/turingas

Assembler for NVIDIA Volta and Turing GPUs

☆246

Alternatives and similar repositories for turingas

Users who are interested in turingas are comparing it to the repositories listed below.

cloudcores / CuAssembler
An unofficial CUDA assembler, for all generations of SASS, hopefully :)
☆609 • Apr 20, 2023 • Updated 3 years ago
PAA-NCIC / PPoPP2017_artifact
Third-party assembler and GEMM library for NVIDIA Kepler GPU
☆86 • Oct 8, 2019 • Updated 6 years ago
daadaada / gas
☆49 • Dec 11, 2020 • Updated 5 years ago
NervanaSystems / maxas
Assembler for NVIDIA Maxwell architecture
☆1,073 • Jan 3, 2023 • Updated 3 years ago
xiuxiazhang / KeplerAs
An Open Source Kepler GPU Assembler
☆22 • Jan 23, 2017 • Updated 9 years ago
sjfeng1999 / gpu-arch-microbenchmark
Dissecting NVIDIA GPU Architecture
☆125 • Jul 11, 2022 • Updated 4 years ago
hyqneuron / asfermi
Assembler for NVIDIA Fermi. Imported from Google Code.
☆77 • Mar 22, 2015 • Updated 11 years ago
hkust-adsl / gass
☆43 • Apr 3, 2022 • Updated 4 years ago
decodecudabinary / Decoding-CUDA-Binary
☆55 • Nov 21, 2019 • Updated 6 years ago
QianyanTech / NBAssembler
Assembler and Decompiler for NVIDIA (Maxwell, Pascal, Volta, Turing, Ampere) GPUs.
☆96 • Feb 23, 2023 • Updated 3 years ago
GVProf / GVProf
GVProf: A Value Profiler for GPU-based Clusters
☆54 • Mar 24, 2024 • Updated 2 years ago
Yinghan-Li / YHs_Sample
Yinghan's Code Sample
☆365 • Jul 25, 2022 • Updated 3 years ago
apuaaChen / vectorSparse
☆32 • Aug 24, 2022 • Updated 3 years ago
BoyuanFeng / APNN-TC
☆20 • Aug 26, 2021 • Updated 4 years ago
lixiuhong / batched_gemm
☆40 • Feb 28, 2020 • Updated 6 years ago
sderek / CUDAAdvisor
CUDAAdvisor: a GPU profiling tool
☆53 • Aug 24, 2018 • Updated 7 years ago
codyjrivera / tsm2x-imp
Implementation of TSM2L and TSM2R: High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA
☆35 • Jul 28, 2020 • Updated 5 years ago
RRZE-HPC / gpu-benches
Collection of benchmarks to measure basic GPU capabilities
☆530 • Oct 24, 2025 • Updated 8 months ago
ConvolutedDog / HyFiSS
HyFiSS: A Hybrid Fidelity Stall-Aware Simulator for GPGPUs
☆42 • Dec 9, 2024 • Updated last year
0xD0GF00D / DocumentSASS
Unofficial description of the CUDA assembly (SASS) instruction sets.
☆223 • Jul 18, 2025 • Updated last year
NVlabs / NVBit
☆341 • Apr 6, 2026 • Updated 3 months ago
PAA-NCIC / GSWITCH
A pattern-based algorithmic autotuner for graph processing on GPUs.
☆33 • Jun 25, 2025 • Updated last year
NVIDIA / cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,104 • Updated this week
kuterd / nv_isa_solver
Nvidia Instruction Set Specification Generator
☆344 • Jul 9, 2024 • Updated 2 years ago
md2z34 / winograd_gpu
GPU implementation of Winograd convolution
☆10 • Oct 23, 2017 • Updated 8 years ago
illinois-impact / klap
A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches
☆15 • Jun 21, 2019 • Updated 7 years ago
redplait / denvdis
NVIDIA SASS disassembler/inline patcher
☆89 • Updated this week
microsoft / nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
☆1,002 • Sep 19, 2024 • Updated last year
yzhaiustc / Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.
☆420 • Jan 2, 2025 • Updated last year
NVIDIA / nvbench
CUDA Kernel Benchmarking Library
☆901 • Updated this week
ap-hynninen / cutt
CUDA Tensor Transpose (cuTT) library
☆55 • Aug 10, 2017 • Updated 8 years ago
pku-liang / FlexTensor
Automatic Schedule Exploration and Optimization Framework for Tensor Computations
☆184 • Apr 25, 2022 • Updated 4 years ago
gpgpu-sim / gpgpu-sim_distribution
GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for…
☆1,674 • Feb 15, 2025 • Updated last year
HPMLL / DTC-SpMM_ASPLOS24
☆47 • Jun 19, 2024 • Updated 2 years ago
sunlex0717 / DissectingTensorCores
☆114 • Apr 19, 2024 • Updated 2 years ago
LeiWang1999 / tvm_gpu_gemm
Play GEMM with TVM
☆91 • Jul 22, 2023 • Updated 2 years ago
xingyul / sparse-winograd-cnn
Efficient Sparse-Winograd Convolutional Neural Networks (ICLR 2018)
☆191 • May 7, 2019 • Updated 7 years ago
xuqiantong / CUDA-Winograd
Fast CUDA Kernels for ResNet Inference.
☆183 • May 26, 2019 • Updated 7 years ago
google-research / sputnik
A library of GPU kernels for sparse matrix operations.
☆289 • Nov 24, 2020 • Updated 5 years ago