masahi/tvm-cutlass-eval

masahi / tvm-cutlass-eval

☆41

Alternatives and similar repositories for tvm-cutlass-eval

Users that are interested in tvm-cutlass-eval are comparing it to the libraries listed below.

masahi / torchscript-to-tvm
☆68 · Mar 4, 2023 · Updated 3 years ago
CharlieCurry / tvm-learning
TVM learning and research
☆13 · Jan 8, 2021 · Updated 5 years ago
Yongqi-Zhuo / triton-tvm
Triton to TVM transpiler.
☆24 · Oct 14, 2024 · Updated last year
tlc-pack / TLCBench
Benchmark scripts for TVM
☆75 · Mar 15, 2022 · Updated 4 years ago
apache / tvm-rfcs
A home for the final text of all TVM RFCs.
☆111 · Sep 24, 2024 · Updated last year
WuDan0399 / Integrate-NVDLA-and-TVM
☆33 · Mar 6, 2023 · Updated 3 years ago
pnnl / TCBNN
☆39 · Jul 25, 2022 · Updated 3 years ago
cmu-catalyst / collage
System for automated integration of deep learning backends.
☆47 · Aug 15, 2022 · Updated 3 years ago
tlc-pack / relax
☆193 · Mar 28, 2023 · Updated 3 years ago
Archermmt / tvm_walk_through
code reading for tvm
☆75 · Jan 20, 2022 · Updated 4 years ago
lenLRX / AmpereSparseMatmul
study of Ampere' Sparse Matmul
☆18 · Jan 10, 2021 · Updated 5 years ago
masahi / tvm-winograd
Test winograd convolution written in TVM for CUDA and AMDGPU
☆41 · Oct 12, 2018 · Updated 7 years ago
Tencent / BlazerML-tvm
Tencent Distribution of TVM
☆16 · Apr 7, 2023 · Updated 3 years ago
tlc-pack / libflash_attn
Standalone Flash Attention v2 kernel without libtorch dependency
☆113 · Sep 10, 2024 · Updated last year
MoZeWei / moTuner
☆10 · May 12, 2022 · Updated 4 years ago
wzh99 / relay-mlir
An MLIR-based toy DL compiler for TVM Relay.
☆62 · Oct 16, 2022 · Updated 3 years ago
flashinfer-ai / cutlass-viz
☆65 · Apr 26, 2025 · Updated last year
uwsampl / SparseTIR
SparseTIR: Sparse Tensor Compiler for Deep Learning
☆145 · Mar 31, 2023 · Updated 3 years ago
LeiWang1999 / tvm_gpu_gemm
play gemm with tvm
☆91 · Jul 22, 2023 · Updated 3 years ago
Engineev / solutions
My personal solutions to some textbook problems
☆12 · Feb 12, 2020 · Updated 6 years ago
Ratbuyer / h100-features
☆18 · Mar 12, 2025 · Updated last year
pku-liang / AMOS
Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators
☆125 · Oct 26, 2022 · Updated 3 years ago
MingliSun / MLIR-TVM
☆13 · Nov 25, 2019 · Updated 6 years ago
efeslab / Atom
[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
☆344 · Jul 2, 2024 · Updated 2 years ago
lucifer1004 / VeloQ
Agent-friendly GPU profile-query CLI
☆106 · Jun 22, 2026 · Updated last month
AniZpZ / smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
☆11 · Dec 13, 2023 · Updated 2 years ago
uwsampl / sparsetir-artifact
Repository for artifact evaluation of ASPLOS 2023 paper "SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning"
☆25 · Feb 24, 2023 · Updated 3 years ago
microsoft / nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
☆1,002 · Sep 19, 2024 · Updated last year
xxyux / SpInfer
SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs
☆68 · Mar 25, 2025 · Updated last year
xuyifangreeneyes / mxcompiler
Craft a toy compiler
☆10 · Aug 21, 2019 · Updated 6 years ago
JohndeVostok / APE
A GPU FP32 computation method with Tensor Cores.
☆27 · Dec 8, 2025 · Updated 7 months ago
thu-pacman / PET
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
☆126 · Jun 23, 2022 · Updated 4 years ago
phillipstanleymarbell / Noisy-lang-compiler
Noisy language compiler
☆17 · Jul 31, 2024 · Updated last year
chih-chun-chang / tvm-yolov3
compile yolov3 in TVM
☆13 · Aug 14, 2023 · Updated 2 years ago
tsinghua-ideal / Syno
Source code repository for ASPLOS '25 paper "Syno: Structured Synthesis for Neural Operators"
☆15 · Aug 31, 2025 · Updated 10 months ago
KuangjuX / AttnLink
An experimental communicating attention kernel based on DeepEP.
☆34 · Jul 29, 2025 · Updated 11 months ago
onnx / optimizer
ONNX Optimizer
☆825 · Updated this week
merrymercy / awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
☆2,768 · Oct 19, 2024 · Updated last year
StrongSpoon / tvm.schedule
examples for tvm schedule API
☆101 · Jun 12, 2023 · Updated 3 years ago