sophgo / tpu-mlir
Machine learning compiler based on MLIR for Sophgo TPU.
☆737Updated last week
Alternatives and similar repositories for tpu-mlir
Users who are interested in tpu-mlir are comparing it to the repositories listed below
- A collection of compiler learning resources.☆2,423Updated 3 months ago
- An MLIR-based compiler framework bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).☆599Updated last week
- MegCC is a deep learning model compiler with an ultra-lightweight runtime that is efficient and easy to port☆485Updated 7 months ago
- Hands-On Practical MLIR Tutorial☆511Updated last year
- how to learn PyTorch and OneFlow☆435Updated last year
- Development repository for the Triton-Linalg conversion☆188Updated 4 months ago
- BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.☆873Updated 5 months ago
- PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.☆1,697Updated last year
- row-major matmul optimization☆637Updated last year
- A model compilation solution for various hardware☆437Updated this week
- FlagGems is an operator library for large language models implemented in the Triton Language.☆573Updated this week
- ppl.cv is a high-performance image processing library of openPPL supporting various platforms.☆504Updated 7 months ago
- A primitive library for neural network☆1,345Updated 6 months ago
- A CUDA tutorial to make people learn CUDA program from 0☆234Updated 11 months ago
- Easy-to-use, high-performance, multi-platform inference deployment framework☆984Updated this week
- Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure☆873Updated last week
- Run generative AI models in sophgo BM1684X/BM1688☆220Updated this week
- ☆609Updated last year
- A guide to hand-writing CUDA operators and interview preparation☆426Updated 5 months ago
- how to optimize some algorithm in cuda.☆2,269Updated this week
- learning how CUDA works☆269Updated 3 months ago
- A simple high performance CUDA GEMM implementation.☆380Updated last year
- Efficient operation implementation based on the Cambricon Machine Learning Unit (MLU).☆123Updated this week
- Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruct…☆421Updated 9 months ago
- ☆283Updated 3 years ago
- PyTorch Neural Network eXchange☆594Updated last week
- A parser, editor and profiler tool for ONNX models.☆442Updated 2 weeks ago
- ☆278Updated 8 months ago
- Yinghan's Code Sample☆330Updated 2 years ago
- ☆372Updated this week