yanghaku / tvm-rt-wasm

A high-performance, tiny TVM graph executor library written in C that can compile to WebAssembly and use CUDA/WebGPU as the accelerator.

☆12

Alternatives and similar repositories for tvm-rt-wasm

Users who are interested in tvm-rt-wasm are comparing it to the libraries listed below.

mlc-ai / relax
☆161Updated 2 weeks ago
prajna-lang / prajna
A simple general-purpose programming language
☆97Updated this week
gty111 / PTX-EMU
PTX-EMU is a simple emulator for CUDA programs.
☆34Updated 3 months ago
iree-org / iree-experimental
Experiments and prototypes associated with IREE or MLIR
☆54Updated last year
AyakaGEMM / Hands-on-MLIR
☆17Updated last year
IBM / onnx-mlir-serving
ONNX Serving is a project written in C++ to serve onnx-mlir compiled models with gRPC and other protocols. Benefiting from C++ implement…
☆24Updated 3 months ago
code-nowww / 2020-USTC-Compiler-Lab-MLIR
The final lab of the Compiler course at USTC, focusing on MLIR
☆18Updated 4 years ago
YangLinzhuo / cuda-sgemm-optimization
CUDA SGEMM optimization notes
☆13Updated last year
microsoft / ark
A GPU-driven system framework for scalable AI applications
☆117Updated 6 months ago
wzh99 / relay-mlir
An MLIR-based toy DL compiler for TVM Relay.
☆58Updated 2 years ago
tlc-pack / libflash_attn
Standalone Flash Attention v2 kernel without libtorch dependency
☆111Updated 10 months ago
DeepLink-org / ditorch
☆23Updated 7 months ago
tfruan2000 / mlsys-study-note
My study notes for MLSys
☆15Updated 9 months ago
FlagOpen / FlagCX
☆82Updated this week
QianyanTech / NBAssembler
Assembler and decompiler for NVIDIA (Maxwell, Pascal, Volta, Turing, Ampere) GPUs.
☆82Updated 2 years ago
Lin-Mao / DrGPUM
A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.
☆25Updated 9 months ago
MLIR-China / mlir-playground
Play with MLIR right in your browser
☆135Updated 2 years ago
roastduck / FreeTensor
A language and compiler for irregular tensor programs.
☆149Updated 8 months ago
Yongqi-Zhuo / triton-tvm
Triton to TVM transpiler.
☆21Updated 9 months ago
FlagTree / flagtree
FlagTree is a unified compiler for multiple AI chips, which is forked from triton-lang/triton.
☆67Updated this week
pytorch-labs / triton-cpu
An experimental CPU backend for Triton (https://github.com/openai/triton)
☆43Updated 4 months ago
TiledTensor / TiledCUDA
We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel …
☆183Updated 6 months ago
microsoft / TileFusion
TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.
☆93Updated last month
microsoft / FractalTensor
FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …
☆28Updated 7 months ago
InfiniTensor / RefactorGraph
A layered, decoupled deep learning inference engine
☆75Updated 5 months ago
Multi-V-VM / hetGPU
PTX on XPUs
☆48Updated this week
l1nkr / DL-Compiler-Navigation
Machine Learning Compiler Road Map
☆43Updated last year
bytedance / byteir
A model compilation solution for various hardware
☆439Updated 2 weeks ago
InfiniTensor / InfiniTensor
☆246Updated last week
bytedance / ByteMLPerf
AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and ver…
☆256Updated this week