InfiniTensor/InfiniTensor

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/InfiniTensor/InfiniTensor)

InfiniTensor / InfiniTensor

InfiniTensor is a high-performance inference engine tailored for GPUs and AI accelerators. Its design focuses on effective deployment and swift academic validation.

☆372

Alternatives and similar repositories for InfiniTensor

Users that are interested in InfiniTensor are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

InfiniTensor / InfiniCore
View on GitHub
☆67Updated this week
InfiniTensor / operators
View on GitHub
算子库
☆17Jul 9, 2025Updated last year
thu-pacman / PET
View on GitHub
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
☆126Jun 23, 2022Updated 4 years ago
humuyan / Korch
View on GitHub
ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch
☆41Mar 27, 2025Updated last year
roastduck / FreeTensor
View on GitHub
A language and compiler for irregular tensor programs.
☆152Updated this week
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
LearningInfiniTensor / TinyInfiniTensor
View on GitHub
☆44Jan 8, 2025Updated last year
LeiWang1999 / Stream-k.tvm
View on GitHub
☆20Sep 28, 2024Updated last year
InfiniTensor / InfiniTrain
View on GitHub
☆44Updated this week
InfiniTensor / InfiniLM
View on GitHub
☆184Updated this week
KnowingNothing / MatmulTutorial
View on GitHub
A Easy-to-understand TensorOp Matmul Tutorial
☆445Mar 5, 2026Updated 4 months ago
uwsampl / SparseTIR
View on GitHub
SparseTIR: Sparse Tensor Compiler for Deep Learning
☆145Mar 31, 2023Updated 3 years ago
InfiniTensor / InfiniCCL
View on GitHub
InfiniCCL is a unified, cross-platform collective communication library designed for heterogeneous accelerator environments.
☆15Updated this week
pku-liang / MAGIS
View on GitHub
MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)
☆57May 29, 2024Updated 2 years ago
TiledTensor / TiledKernel
View on GitHub
TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.
☆19May 12, 2024Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
Cjkkkk / CUDA_gemm
View on GitHub
A simple high performance CUDA GEMM implementation.
☆437Jan 4, 2024Updated 2 years ago
nox-410 / tvm.tl
View on GitHub
An extention of TVMScript to write simple and high performance GPU kernels with tensorcore.
☆52Jul 23, 2024Updated last year
tlc-pack / cutlass_fpA_intB_gemm
View on GitHub
A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer
☆96Jun 21, 2026Updated 3 weeks ago
mlc-ai / mlc-python
View on GitHub
☆36Jul 19, 2025Updated last year
Liu-xiandong / How_to_optimize_in_GPU
View on GitHub
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several…
☆1,329Jul 29, 2023Updated 2 years ago
microsoft / nnfusion
View on GitHub
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
☆1,002Sep 19, 2024Updated last year
InfiniTensor / TinyInfiniTrain
View on GitHub
训练营训练方向项目
☆28Jan 28, 2026Updated 5 months ago
mirage-project / mirage
View on GitHub
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
☆2,376Updated this week
JohndeVostok / APE
View on GitHub
A GPU FP32 computation method with Tensor Cores.
☆27Dec 8, 2025Updated 7 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
BBuf / tvm_mlir_learn
View on GitHub
compiler learning resources collect.
☆2,759May 20, 2026Updated 2 months ago
tlc-pack / relax
View on GitHub
☆193Mar 28, 2023Updated 3 years ago
tsinghua-ideal / Syno
View on GitHub
Source code repository for ASPLOS '25 paper "Syno: Structured Synthesis for Neural Operators"
☆15Aug 31, 2025Updated 10 months ago
pku-liang / AMOS
View on GitHub
Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators
☆125Oct 26, 2022Updated 3 years ago
jiazhihao / TASO
View on GitHub
The Tensor Algebra SuperOptimizer for Deep Learning
☆742Jan 26, 2023Updated 3 years ago
KnowingNothing / compiler-and-arch
View on GitHub
A list of tutorials, paper, talks, and open-source projects for emerging compiler and architecture
☆531Jan 15, 2025Updated last year
jiazhihao / attention_superoptimizer
View on GitHub
An Attention Superoptimizer
☆22Jan 20, 2025Updated last year
TiledTensor / TiledCUDA
View on GitHub
We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel …
☆192Jan 28, 2025Updated last year
buddy-compiler / buddy-mlir
View on GitHub
An MLIR-based compiler framework bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).
☆742Updated this week
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
hidet-org / hidet
View on GitHub
An open-source efficient deep learning framework/compiler, written in python.
☆743Sep 4, 2025Updated 10 months ago
NVIDIA / tilus
View on GitHub
Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.
☆489Jul 5, 2026Updated 2 weeks ago
apache / tvm-ffi
View on GitHub
Open ABI and FFI for Machine Learning Systems
☆434Updated this week
microsoft / FractalTensor
View on GitHub
FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …
☆32Dec 21, 2024Updated last year
bytedance / byteir
View on GitHub
A model compilation solution for various hardware
☆473Aug 20, 2025Updated 11 months ago
MegEngine / MegCC
View on GitHub
MegCC是一个运行时超轻量，高效，移植简单的深度学习模型编译器
☆482Oct 23, 2024Updated last year
LearningInfiniTensor / learning-lm-rs
View on GitHub
☆69Nov 21, 2024Updated last year