prajna-lang / prajna
A programming language for AI infrastructure
☆85 Updated this week
Related projects:
- play gemm with tvm ☆81 Updated last year
- An unofficial cuda assembler, for all generations of SASS, hopefully :) ☆74 Updated last year
- ☆151 Updated this week
- ☆19 Updated this week
- Play with MLIR right in your browser ☆122 Updated last year
- ☆71 Updated last year
- TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles. ☆114 Updated last week
- Assembler and Decompiler for NVIDIA (Maxwell Pascal Volta Turing Ampere) GPUs. ☆66 Updated last year
- A layered, decoupled deep learning inference engine ☆58 Updated 3 weeks ago
- Compiler Infrastructure for Neural Networks ☆142 Updated last year
- An MLIR-based toy DL compiler for TVM Relay. ☆53 Updated last year
- ☆77 Updated last year
- Standalone Flash Attention v2 kernel without libtorch dependency ☆93 Updated last week
- ☆133 Updated 2 months ago
- A model compilation solution for various hardware ☆357 Updated this week
- A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer ☆82 Updated 6 months ago
- ☆92 Updated 3 years ago
- Benchmark Framework for Buddy Projects ☆45 Updated 3 weeks ago
- A language and compiler for irregular tensor programs. ☆132 Updated 4 months ago
- A benchmark suited especially for deep learning operators ☆40 Updated last year
- code reading for tvm ☆69 Updated 2 years ago
- A demo of how to write a high-performance convolution that runs on Apple silicon ☆52 Updated 2 years ago
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections ☆112 Updated 2 years ago
- Efficient operation implementation based on the Cambricon Machine Learning Unit (MLU). ☆100 Updated last week
- examples for tvm schedule API ☆97 Updated last year
- OneFlow->ONNX ☆41 Updated last year
- ☆15 Updated 4 months ago
- MegCC is a deep learning model compiler with an ultra-lightweight runtime, high efficiency, and easy portability ☆467 Updated last month
- Machine Learning Compiler Road Map ☆40 Updated last year
- ☆56 Updated this week