Arm-China / Compass_Apache_TVM
Compass Apache TVM is an enhanced version of Apache TVM that provides quick support, optimization, and heterogeneous execution for a wide range of Neural Network (NN) models.
☆21Updated last week
Alternatives and similar repositories for Compass_Apache_TVM
Users that are interested in Compass_Apache_TVM are comparing it to the libraries listed below
- An optimized neural network operator library for chips based on Xuantie CPU.☆96Updated last year
- ☆33Updated 2 years ago
- LLVM OpenCL C compiler suite for ventus GPGPU☆58Updated last month
- RISCV C and Triton AI-Benchmark☆23Updated last month
- Fork of upstream onnxruntime focused on supporting risc-v accelerators☆88Updated 2 years ago
- edge/mobile transformer based Vision DNN inference benchmark☆16Updated 5 months ago
- FlagTree is a unified compiler supporting multiple AI chip backends for custom Deep Learning operations, which is forked from triton-lang…☆197Updated this week
- ☆122Updated this week
- AI-ML-NLP Task Group☆13Updated 2 years ago
- IREE plugin repository for the AMD AIE accelerator☆119Updated this week
- ☆13Updated 6 years ago
- armchina NPU parser☆41Updated last week
- CSV spreadsheets and other material for AI accelerator survey papers☆189Updated 2 months ago
- This project contains a code generator that produces static C NN inference deployment code targeting tiny micro-controllers (TinyML) as r…☆30Updated 4 years ago
- ☆68Updated 2 years ago
- Efficient operation implementation based on the Cambricon Machine Learning Unit (MLU).☆150Updated last week
- My study note for mlsys☆15Updated last year
- Assembler and Decompiler for NVIDIA (Maxwell Pascal Volta Turing Ampere) GPUs.☆96Updated 2 years ago
- OpenAI Triton backend for Intel® GPUs☆225Updated this week
- ☆48Updated 5 years ago
- muRISCV-NN is a collection of efficient deep learning kernels for embedded platforms and microcontrollers.☆90Updated 3 months ago
- ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference☆185Updated 3 weeks ago
- ☆164Updated this week
- VeriSilicon Tensor Interface Module☆246Updated last week
- Zhouyi model zoo☆108Updated 3 months ago
- ☆170Updated 2 years ago
- A Winograd Minimal Filter Implementation in CUDA☆28Updated 4 years ago
- ☆18Updated 2 weeks ago
- ☆29Updated 4 years ago
- A simple end-to-end example of taking an ML graph (TF2 / PyTorch) and running it on a device [cpu, gpu]☆36Updated 5 years ago