LeiWang1999/TVM.CMakeExtend

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/LeiWang1999/TVM.CMakeExtend)

LeiWang1999 / TVM.CMakeExtend

Tutorials of Extending and importing TVM with CMAKE Include dependency.

☆16

Alternatives and similar repositories for TVM.CMakeExtend

Users that are interested in TVM.CMakeExtend are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

tile-ai / tvm
View on GitHub
Open deep learning compiler stack for cpu, gpu and specialized accelerators
☆20Updated this week
mlc-ai / mlc-python
View on GitHub
☆36Jul 19, 2025Updated last year
tsinghua-ideal / Syno
View on GitHub
Source code repository for ASPLOS '25 paper "Syno: Structured Synthesis for Neural Operators"
☆15Aug 31, 2025Updated 10 months ago
plaidml / mlir-generator
View on GitHub
Generator for MLIR files from known front-ends
☆17Oct 31, 2023Updated 2 years ago
LeiWang1999 / Stream-k.tvm
View on GitHub
☆20Sep 28, 2024Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
cherichy / tilecute
View on GitHub
☆32Jul 2, 2025Updated last year
TiledTensor / TiledBench
View on GitHub
Benchmark tests supporting the TiledCUDA library.
☆19Nov 19, 2024Updated last year
KuangjuX / cu-x
View on GitHub
🎉My Collections of CUDA Kernels~
☆11Jun 25, 2024Updated 2 years ago
DD-DuDa / TensorRT-in-Action
View on GitHub
TensorRT-in-Action 是一个 GitHub 代码库，提供了使用 TensorRT 的代码示例，并有对应 Jupyter Notebook。
☆15Jun 1, 2023Updated 3 years ago
sablin39 / tilelang-cuda-skills
View on GitHub
Skills for writing tilelang and debugging with CUDA toolkits.
☆133May 20, 2026Updated 2 months ago
mlc-ai / tirx-kernels
View on GitHub
ML kernels and benchmarking infrastructure written in TIRx
☆70Updated this week
chips-compilers-mlsys-21 / chips-compilers-mlsys-21.github.io
View on GitHub
☆11Apr 5, 2021Updated 5 years ago
triple-mu / Stable-Diffusion-TensorRT
View on GitHub
Stable Diffusion in TensorRT 8.5+
☆15Mar 19, 2023Updated 3 years ago
nox-410 / tvm.tl
View on GitHub
An extention of TVMScript to write simple and high performance GPU kernels with tensorcore.
☆52Jul 23, 2024Updated 2 years ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
latentCall145 / channels-last-groupnorm
View on GitHub
A CUDA kernel for NHWC GroupNorm for PyTorch
☆23Nov 15, 2024Updated last year
InternLM / turbomind
View on GitHub
☆97Mar 26, 2025Updated last year
yifu-ding / BGEMM-CUDA
View on GitHub
BGEMM-CUDA is a CUDA-based low-bit GEMM kernel library for efficient neural network inference. It implements optimized binary and ternary…
☆20Aug 30, 2024Updated last year
tile-ai / tilescale
View on GitHub
Tile-based language built for AI computation across all scales
☆176Jun 16, 2026Updated last month
TiledTensor / TiledCUDA
View on GitHub
We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel …
☆192Jan 28, 2025Updated last year
zhuzilin / pytorch-malloc
View on GitHub
An external memory allocator example for PyTorch.
☆16Aug 10, 2025Updated 11 months ago
microsoft / cusync
View on GitHub
☆27Feb 20, 2024Updated 2 years ago
Bruce-Lee-LY / cutlass_gemm
View on GitHub
Multiple GEMM operators are constructed with cutlass to support LLM inference.
☆20Aug 3, 2025Updated 11 months ago
microsoft / BitBLAS
View on GitHub
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
☆769Aug 6, 2025Updated 11 months ago
Proton VPN Special Offer - Get 70% off • Ad
Special partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
megvii-research / IntLLaMA
View on GitHub
IntLLaMA: A fast and light quantization solution for LLaMA
☆19Jul 21, 2023Updated 3 years ago
toyaix / triton-runner
View on GitHub
Multi-Level Triton Runner supporting Python, IR, PTX, AMDGCN, cubin and hasco.
☆98May 8, 2026Updated 2 months ago
tlc-pack / libflash_attn
View on GitHub
Standalone Flash Attention v2 kernel without libtorch dependency
☆113Sep 10, 2024Updated last year
apache / tvm-rfcs
View on GitHub
A home for the final text of all TVM RFCs.
☆111Sep 24, 2024Updated last year
microsoft / TileFusion
View on GitHub
TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.
☆115Jun 28, 2025Updated last year
tile-ai / tilelang-puzzles
View on GitHub
Learning TileLang with 10 puzzles!
☆349May 28, 2026Updated last month
mlc-ai / relax
View on GitHub
☆175Updated this week
tile-ai / tilelang-benchmark
View on GitHub
☆22Jun 10, 2026Updated last month
feifeibear / ChituAttention
View on GitHub
Quantized Attention on GPU
☆45Nov 22, 2024Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
OpenBitSys / BitDecoding
View on GitHub
[HPCA 2026] A GPU-optimized system for efficient long-context LLMs decoding with low-bit KV cache.
☆96May 14, 2026Updated 2 months ago
sjtu-epcc / Tacker
View on GitHub
Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS
☆33Feb 10, 2025Updated last year
HPMLL / SpInfer_EuroSys25
View on GitHub
☆35Apr 2, 2025Updated last year
sophgo / libsophon
View on GitHub
Sophgo AI chips driver and runtime library.
☆26Updated this week
Archermmt / tvm_walk_through
View on GitHub
code reading for tvm
☆75Jan 20, 2022Updated 4 years ago
apache / tvm-ffi
View on GitHub
Open ABI and FFI for Machine Learning Systems
☆436Updated this week
tanzelin430 / libsmctrl
View on GitHub
libsmctrl论文的复现，添加了python端接口，可以在python端灵活调用接口来分配计算资源
☆12May 21, 2024Updated 2 years ago