AlibabaResearch/mononn

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/AlibabaResearch/mononn)

AlibabaResearch / mononn

☆33

Alternatives and similar repositories for mononn

Users that are interested in mononn are comparing it to the libraries listed below

Sorting:

AlibabaResearch / recom
View on GitHub
An Optimizing Compiler for Recommendation Model Inference
☆26Jun 5, 2025Updated 8 months ago
PanZaifeng / RecFlex
View on GitHub
A recommendation model kernel optimizing system
☆12Jun 5, 2025Updated 8 months ago
monellz / FlashTensor
View on GitHub
☆18Mar 4, 2025Updated 11 months ago
tanzelin430 / libsmctrl
View on GitHub
libsmctrl论文的复现，添加了python端接口，可以在python端灵活调用接口来分配计算资源
☆12May 21, 2024Updated last year
aoli-al / HFuse
View on GitHub
Horizontal Fusion
☆24Jan 7, 2022Updated 4 years ago
TiledTensor / TiledLower
View on GitHub
TiledLower is a Dataflow Analysis and Codegen Framework written in Rust.
☆14Nov 23, 2024Updated last year
tile-ai / tvm
View on GitHub
Open deep learning compiler stack for cpu, gpu and specialized accelerators
☆19Feb 24, 2026Updated last week
summerspringwei / souffle-ae
View on GitHub
☆17Jan 24, 2024Updated 2 years ago
pku-liang / MAGIS
View on GitHub
MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)
☆56May 29, 2024Updated last year
KULeuven-COSIC / vFHE
View on GitHub
☆11Apr 2, 2024Updated last year
usyd-fsalab / ReadingList
View on GitHub
☆13Apr 27, 2022Updated 3 years ago
nox-410 / Welder
View on GitHub
OSDI 2023 Welder, deeplearning compiler
☆32Nov 24, 2023Updated 2 years ago
microsoft / cusync
View on GitHub
☆25Feb 20, 2024Updated 2 years ago
Cerebras / sdk-examples
View on GitHub
☆34May 23, 2025Updated 9 months ago
IBM / triton-dejavu
View on GitHub
Framework to reduce autotune overhead to zero for well known deployments.
☆97Sep 19, 2025Updated 5 months ago
illinois-impact / klap
View on GitHub
A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches
☆15Jun 21, 2019Updated 6 years ago
LeiWang1999 / tvm_gpu_gemm
View on GitHub
play gemm with tvm
☆92Jul 22, 2023Updated 2 years ago
uchuhimo / amanda
View on GitHub
☆18Apr 21, 2024Updated last year
tfruan2000 / mlsys-study-note
View on GitHub
My study note for mlsys
☆14Nov 4, 2024Updated last year
adsl-rg / adsl-rg.github.io
View on GitHub
☆14Jan 28, 2026Updated last month
humuyan / Korch
View on GitHub
ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch
☆39Mar 27, 2025Updated 11 months ago
LeiWang1999 / Stream-k.tvm
View on GitHub
☆20Sep 28, 2024Updated last year
ZhengChenCS / CompressGraph
View on GitHub
☆21Oct 21, 2024Updated last year
microsoft / FractalTensor
View on GitHub
FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …
☆32Dec 21, 2024Updated last year
PanZaifeng / FastTree-Artifact
View on GitHub
☆27Mar 24, 2025Updated 11 months ago
Linestro / GRACE
View on GitHub
Artifact of ASPLOS'23 paper entitled: GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference
☆19Mar 5, 2023Updated 2 years ago
facebookexperimental / triton
View on GitHub
Github mirror of trition-lang/triton repo.
☆146Updated this week
microsoft / triton-shared
View on GitHub
Shared Middle-Layer for Triton Compilation
☆329Dec 5, 2025Updated 2 months ago
microsoft / TileFusion
View on GitHub
TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.
☆106Jun 28, 2025Updated 8 months ago
KuangjuX / Paper-reading
View on GitHub
My Paper Reading Lists and Notes.
☆21Feb 17, 2026Updated 2 weeks ago
toyaix / triton-runner
View on GitHub
Multi-Level Triton Runner supporting Python, IR, PTX, and cubin.
☆84Updated this week
ColfaxResearch / cfx-article-src
View on GitHub
☆178May 7, 2025Updated 9 months ago
nox-410 / tvm.tl
View on GitHub
An extention of TVMScript to write simple and high performance GPU kernels with tensorcore.
☆50Jul 23, 2024Updated last year
AlibabaResearch / flash-llm
View on GitHub
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
☆234Sep 24, 2023Updated 2 years ago
KnowingNothing / MatmulTutorial
View on GitHub
A Easy-to-understand TensorOp Matmul Tutorial
☆410Feb 11, 2026Updated 3 weeks ago
triton-lang / Triton-to-tile-IR
View on GitHub
incubator repo for CUDA-TileIR backend
☆109Feb 14, 2026Updated 2 weeks ago
microsoft / vattention
View on GitHub
Dynamic Memory Management for Serving LLMs without PagedAttention
☆464May 30, 2025Updated 9 months ago
uiuc-arc / felix
View on GitHub
Optimize tensor program fast with Felix, a gradient descent autotuner.
☆32Apr 27, 2024Updated last year
hgyhungry / alcop-artifact
View on GitHub
☆24Mar 15, 2023Updated 2 years ago