microsoft / nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates high-performance executables from a DNN model description.

☆992

Alternatives and similar repositories for nnfusion

Users who are interested in nnfusion are comparing it to the libraries listed below.

jiazhihao / TASO
The Tensor Algebra SuperOptimizer for Deep Learning
☆726 Updated 2 years ago
alibaba / BladeDISC
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
☆886 Updated 7 months ago
d2l-ai / d2l-tvm
Dive into Deep Learning Compiler
☆646 Updated 3 years ago
onnx / onnx-mlir
Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure
☆887 Updated this week
tpoisonooo / how-to-optimize-gemm
row-major matmul optimization
☆649 Updated last year
tlc-pack / relax
☆196 Updated 2 years ago
pytorch / FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
☆1,415 Updated this week
alibaba / heterogeneity-aware-lowering-and-optimization
heterogeneity-aware-lowering-and-optimization
☆255 Updated last year
tensorflow / mlir-hlo
☆420 Updated this week
pytorch / tvm
TVM integration into PyTorch
☆453 Updated 5 years ago
Yinghan-Li / YHs_Sample
Yinghan's Code Sample
☆340 Updated 3 years ago
bytedance / byteir
A model compilation solution for various hardware
☆439 Updated last week
cloudcores / CuAssembler
An unofficial CUDA assembler, for all generations of SASS, hopefully :)
☆523 Updated 2 years ago
flame / blislab
BLISlab: A Sandbox for Optimizing GEMM
☆531 Updated 4 years ago
OpenPPL / ppl.nn
A primitive library for neural network
☆1,345 Updated 8 months ago
MegEngine / MegCC
MegCC is a deep learning model compiler with an ultra-lightweight runtime, high efficiency, and easy portability
☆486 Updated 9 months ago
pigirons / cpufp
A CPU tool for benchmarking the peak of floating points
☆557 Updated 3 weeks ago
Cambricon / triton-linalg
Development repository for the Triton-Linalg conversion
☆190 Updated 5 months ago
dmlc / dlpack
common in-memory tensor structure
☆1,042 Updated last month
mit-han-lab / inter-operator-scheduler
[MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration
☆200 Updated 3 years ago
daadaada / turingas
Assembler for NVIDIA Volta and Turing GPUs
☆226 Updated 3 years ago
yzhaiustc / Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.
☆369 Updated 6 months ago
Cjkkkk / CUDA_gemm
A simple high performance CUDA GEMM implementation.
☆392 Updated last year
Oneflow-Inc / DLPerf
DeepLearning Framework Performance Profiling Toolkit
☆285 Updated 3 years ago
msr-fiddle / pipedream
☆393 Updated 2 years ago
buddy-compiler / buddy-mlir
An MLIR-based compiler framework that bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).
☆611 Updated this week
llvm / torch-mlir
The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem.
☆1,591 Updated last week
tqchen / ffi-navigator
☆241 Updated this week
tensorflow / runtime
A performant and modular runtime for TensorFlow
☆758 Updated 3 months ago
pigirons / sgemm_hsw
This is an implementation of sgemm_kernel on L1d cache.
☆229 Updated last year