lucasjinreal / AI-Infer-Engine-From-Zero

A handbook on building your own AI inference engine: everything you need to know, starting from zero

☆267

Alternatives and similar repositories for AI-Infer-Engine-From-Zero

Users who are interested in AI-Infer-Engine-From-Zero are comparing it to the repositories listed below

MegEngine / MegCC
MegCC is a deep learning model compiler with an ultra-lightweight runtime, high efficiency, and easy portability
☆486 Updated 9 months ago
Oldpan / DeployIsAllYouNeed
☆121 Updated 2 years ago
torchpipe / torchpipe
Serving Inside Pytorch
☆163 Updated this week
openmlsys / openmlsys-cuda
Tutorials for writing high-performance GPU operators in AI frameworks.
☆129 Updated last year
BBuf / how-to-learn-deep-learning-framework
how to learn PyTorch and OneFlow
☆445 Updated last year
Eddie-Wang1120 / Professional-CUDA-C-Programming-Code-and-Notes
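Code and notes for Professional CUDA C Programming, covering most of the code from chapters 2 through 8 of the book along with the author's notes. Everything was implemented by hand by the author, so some errors are inevitable; please use it with care, and corrections are very welcome. If you find it helpful, please give it a star; it helps the author a lot. Thank you!
☆350 Updated 2 years ago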
mlc-ai / mlc-zh
☆611 Updated last year
zjhellofss / KuiperCourse
Course published on Bilibili
☆75 Updated last year
BBuf / how-to-optimize-gemm
☆97 Updated 4 years ago
scarsty / ncnn-editor
Editor for the ncnn and pnnx formats
☆136 Updated 10 months ago
starmee / AI-Notes
My learning notes about AI, including Machine Learning and Deep Learning.
☆18 Updated 6 years ago
YellowOldOdd / SDBI
Simple Dynamic Batching Inference
☆145 Updated 3 years ago
dingyuqing05 / trt2022_wenet
☆72 Updated 2 years ago
BBuf / onnx_learn
☆99 Updated 4 years ago
harleyszhang / lite_llama
A light llama-like llm inference framework based on the triton kernel.
☆144 Updated last week
InfiniTensor / RefactorGraph
A layered, decoupled deep learning inference engine
☆74 Updated 5 months ago
TRT2022 / MST-plus-plus-TensorRT
TensorRT 2022 semifinal solution: TensorRT inference optimization of MST++, the first Transformer-based image reconstruction model
☆140 Updated 3 years ago
nndeploy / nndeploy
Workflow-based Multi-platform AI Deployment Tool
☆1,123 Updated this week
Oneflow-Inc / oneflow_convert
OneFlow->ONNX
☆43 Updated 2 years ago
luchangli03 / onnxsim_large_model
simplify >2GB large onnx model
☆61 Updated 8 months ago
Tencent / TPAT
TensorRT Plugin Autogen Tool
☆369 Updated 2 years ago
OpenPPL / ppl.nn.llm
☆139 Updated last year
OpenPPL / ppl.llm.serving
☆128 Updated 7 months ago
bytedance / ByteTransformer
optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052
☆474 Updated last year
caiwanxianhust / FasterLLaMA
A llama model inference framework implemented in CUDA C++
☆58 Updated 9 months ago
frankwang0818 / AI_compiler_development_guide
Free resource for the book AI Compiler Development Guide
☆46 Updated 2 years ago
tpoisonooo / how-to-optimize-gemm
row-major matmul optimization
☆649 Updated last year
netease-youdao / EMLL
Edge Machine Learning Library
☆196 Updated 2 years ago
OpenPPL / ppl.kernel.cuda
☆37 Updated 9 months ago
MegEngine / mperf
mperf is an operator performance tuning toolbox for mobile and embedded platforms
☆188 Updated last year