dmlc/nnvm-fusion

Kernel Fusion and Runtime Compilation Based on NNVM

☆72

Alternatives and similar repositories for nnvm-fusion

Users who are interested in nnvm-fusion are comparing it to the libraries listed below.

mohamed / roofline
A simple script to plot the Roofline model for given HW platforms and applications
☆10 · Mar 17, 2026 · Updated 4 months ago
octoml / synr
A library for syntactically rewriting Python programs, pronounced (sinner).
☆66 · Feb 22, 2022 · Updated 4 years ago
uwsampl / relay-aot
An experimental ahead of time compiler for Relay.
☆49 · Apr 21, 2020 · Updated 6 years ago
uchuhimo / amanda
☆18 · Apr 21, 2024 · Updated 2 years ago
dmlc / nnvm
☆1,650 · Sep 11, 2018 · Updated 7 years ago
jiazhihao / TASO
The Tensor Algebra SuperOptimizer for Deep Learning
☆742 · Jan 26, 2023 · Updated 3 years ago
mit-han-lab / inter-operator-scheduler
[MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration
☆201 · Apr 27, 2022 · Updated 4 years ago
dmlc / dlpack
common in-memory tensor structure
☆1,233 · Jun 19, 2026 · Updated last month
szagoruyko / openai-gemm.pytorch
PyTorch bindings for openai-gemm
☆20 · Feb 6, 2017 · Updated 9 years ago
tlc-pack / TLCBench
Benchmark scripts for TVM
☆75 · Mar 15, 2022 · Updated 4 years ago
dmlc / HalideIR
Symbolic Expression and Statement Module for new DSLs
☆207 · Oct 6, 2020 · Updated 5 years ago
strin / gemm-android
tutorial to optimize GEMM performance on android
☆51 · Feb 17, 2016 · Updated 10 years ago
awslabs / lorien
☆42 · Sep 8, 2023 · Updated 2 years ago
billmuch / matmul_perf_test
☆15 · Apr 15, 2022 · Updated 4 years ago
tobiasgrosser / islplot
Library to plot integer sets and maps
☆53 · Nov 27, 2016 · Updated 9 years ago
baidu-research / DeepBench
Benchmarking Deep Learning operations on different hardware
☆1,104 · Apr 25, 2021 · Updated 5 years ago
jiazhihao / ROC
Distributed Multi-GPU GNN Framework
☆36 · Jun 26, 2020 · Updated 6 years ago
uwsampl / SparseTIR
SparseTIR: Sparse Tensor Compiler for Deep Learning
☆145 · Mar 31, 2023 · Updated 3 years ago
LeiWang1999 / tvm_gpu_gemm
play gemm with tvm
☆91 · Jul 22, 2023 · Updated 3 years ago
openai / openai-gemm
Open single and half precision gemm implementations
☆396 · Apr 2, 2023 · Updated 3 years ago
TanDongXu / CUDA-MCDNN
☆12 · Jul 13, 2017 · Updated 9 years ago
harvardnlp / lie-access-memory
☆18 · Mar 5, 2017 · Updated 9 years ago
neulab / dynet-benchmark
Benchmarks for DyNet
☆55 · Sep 22, 2025 · Updated 10 months ago
yongwonshin / PIMFlow
☆15 · Mar 10, 2024 · Updated 2 years ago
pytorch / tvm
TVM integration into PyTorch
☆455 · Jan 15, 2020 · Updated 6 years ago
xiezhq-hermann / graphiler
Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef…
☆59 · Oct 3, 2022 · Updated 3 years ago
tqchen / tinyflow
Tutorial code on how to build your own Deep Learning System in 2k Lines
☆2,017 · Oct 4, 2018 · Updated 7 years ago
NVIDIA / cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
☆1,840 · Oct 9, 2023 · Updated 2 years ago
shiyangdaisy23 / vqa-mxnet-gluon
☆16 · Nov 21, 2017 · Updated 8 years ago
NervanaSystems / maxas
Assembler for NVIDIA Maxwell architecture
☆1,074 · Jan 3, 2023 · Updated 3 years ago
d2l-ai / d2l-tvm
Dive into Deep Learning Compiler
☆649 · Jun 19, 2022 · Updated 4 years ago
tbd-ai / tbd-tools
☆12 · May 3, 2020 · Updated 6 years ago
tensor-compiler / taco
The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs
☆1,364 · Apr 14, 2025 · Updated last year
davidBelanger / torch-util
utility code for doing deep nlp in torch
☆17 · May 16, 2017 · Updated 9 years ago
VoVAllen / tf-dlpack
DLPack for Tensorflow
☆34 · Apr 13, 2020 · Updated 6 years ago
microsoft / nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
☆1,002 · Sep 19, 2024 · Updated last year
masahi / tvm-cutlass-eval
☆41 · Mar 31, 2022 · Updated 4 years ago
flame / fmm-gen
Generating Families of Practical Fast Matrix Multiplication Algorithms
☆12 · Jul 7, 2017 · Updated 9 years ago
YukeWang96 / MGG_OSDI23
Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult…
☆40 · Mar 17, 2024 · Updated 2 years ago