tpoisonooo / chgemm

symmetric int8 gemm

☆67

Alternatives and similar repositories for chgemm

Users that are interested in chgemm are comparing it to the libraries listed below

carlushuang / cpu_gemm_opt
how to design cpu gemm on x86 with avx256, that can beat openblas.
☆71 Updated 6 years ago
BBuf / how-to-optimize-gemm
☆98 Updated 4 years ago
AI-performance / embedded-ai.bench
Benchmark for embedded-AI deep learning inference engines, such as NCNN / TNN / MNN / TensorFlow Lite etc.
☆204 Updated 4 years ago
OpenPPL / ppl.common
Common libraries for PPL projects
☆29 Updated 7 months ago
BBuf / ArmNeonOptimization
arm-neon
☆92 Updated last year
FrozenGene / tvm-tutorial
TVM tutorial
☆66 Updated 6 years ago
ChenShisen / ncnnqat
Quantization-aware training package for NCNN on PyTorch
☆69 Updated 4 years ago
lyuchuny3 / Tengine_gemm_tutorial
Tengine gemm tutorial, step by step
☆13 Updated 4 years ago
XiaoMi / nnlib
Fork of https://source.codeaurora.org/quic/hexagon_nn/nnlib
☆58 Updated 2 years ago
whitelok / tvm-lesson
A hands-on tutorial on the core principles of TVM
☆63 Updated 4 years ago
atanmarko / ncnn-with-cuda
Tencent NCNN with added CUDA support
☆70 Updated 4 years ago
BBuf / Memory-efficient-Convolution-for-Deep-Neural-Network
☆21 Updated 5 years ago
starmee / AI-Notes
My learning notes about AI, including Machine Learning and Deep Learning.
☆18 Updated 6 years ago
alibaba / heterogeneity-aware-lowering-and-optimization
heterogeneity-aware-lowering-and-optimization
☆256 Updated last year
OAID / Tengine-Convert-Tools
Tengine Convert Tool supports converting models from multiple frameworks into tmfile, the format suitable for the Tengine-Lite AI framework.
☆92 Updated 4 years ago
Forwil / tvmt_v2
☆10 Updated 5 years ago
vinx13 / tvm-cuda-int8-benchmark
Benchmark of TVM quantized model on CUDA
☆111 Updated 5 years ago
OpenPPL / ppl.kernel.cuda
☆37 Updated last year
MegEngine / mperf
mperf is an operator performance tuning toolbox for mobile/embedded platforms
☆190 Updated 2 years ago
pigirons / conv3x3_m1
A demo of how to write a high-performance convolution that runs on Apple silicon
☆56 Updated 3 years ago
OpenPPL / ppl.kernel.cpu
☆18 Updated last year
merrymercy / tvm-mali
Optimizing Mobile Deep Learning on ARM GPU with TVM
☆181 Updated 7 years ago
ModelTC / pyvlova
Yet another polyhedral compiler for deep learning
☆19 Updated 2 years ago
xuqiantong / CUDA-Winograd
Fast CUDA Kernels for ResNet Inference.
☆180 Updated 6 years ago
MegEngine / mgeconvert
Converter from MegEngine to other frameworks
☆70 Updated 2 years ago
Oneflow-Inc / oneflow_convert
OneFlow->ONNX
☆43 Updated 2 years ago
Oneflow-Inc / oneflow-xrt
☆23 Updated 2 years ago
tlc-pack / tophub
tophub autotvm log collections
☆69 Updated 2 years ago
BUG1989 / ncnn
ncnn is a high-performance neural network inference framework optimized for the mobile platform
☆14 Updated 3 years ago
XiuYuLi / flexible-gemm
flexible-gemm conv of deepcore
☆17 Updated 5 years ago