Yulv-git / Model-Inference-Deployment
A curated list of awesome inference deployment frameworks for artificial intelligence (AI) models: OpenVINO, TensorRT, MediaPipe, TensorFlow Lite, TensorFlow Serving, ONNX Runtime, LibTorch, NCNN, TNN, MNN, TVM, MACE, Paddle Lite, MegEngine Lite, OpenPPL, Bolt, ExecuTorch.
☆64 Updated last year
Alternatives and similar repositories for Model-Inference-Deployment
Users that are interested in Model-Inference-Deployment are comparing it to the libraries listed below
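- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM ☆42 Updated last year
- ☢️ TensorRT 2023 second round: inference acceleration and optimization of the Llama model based on TensorRT-LLM ☆49 Updated last year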
- ☆121 Updated 2 years ago
- A Toolkit to Help Optimize Large Onnx Model ☆157 Updated last year
- Serving Inside Pytorch ☆163 Updated last week
- An onnx-based quantization tool. ☆71 Updated last year
- A large number of cuda/tensorrt cases to learn from. ☆136 Updated 2 years ago
- This code accompanies the Bilibili video https://www.bilibili.com/video/BV18L41197Uz/?spm_id_from=333.788&vd_source=eefa4b6e337f16d87d87c2c357db8ca7 ☆69 Updated last year
- A simple tool that can generate TensorRT plugin code quickly. ☆232 Updated 2 years ago
- NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications. ☆204 Updated last year
- ☆128 Updated last year
- A Toolkit to Help Optimize Onnx Model ☆174 Updated last week
- PyTorch Quantization Aware Training Example ☆137 Updated last year
- NVIDIA-Alibaba 2021 TRT competition `second prize` code submission. Team: 美迪康 AI Lab ☆171 Updated 2 years ago
- This is an 8-bit quantization sample for yolov5. PTQ, QAT, and Partial Quantization have all been implemented, and present the results based… ☆102 Updated 2 years ago
- ☆99 Updated 3 years ago
- ☆78 Updated 2 years ago
- Tianchi NVIDIA TensorRT Hackathon 2023, Generative AI Model Optimization Competition: third-place solution in the preliminary round ☆49 Updated last year
- A simple tutorial of SNPE. ☆175 Updated 2 years ago
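- pytorch AutoSlim tools; supports pruning and compressing pytorch models in three lines of code ☆39 Updated 4 years ago
- Efficient deployment: TRT inference for YOLO X, V3, V4, V5, V6, V7, V8, and EdgeYOLO ™️, with pre- and post-processing both implemented as CUDA kernels. CPP/CUDA🚀 ☆49 Updated 2 years ago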
- Large Language Model Onnx Inference Framework ☆36 Updated 6 months ago
- TensorRT 2022 second-round solution: TensorRT inference optimization of MST++, the first Transformer-based image reconstruction model ☆139 Updated 3 years ago
- YOLOv5 on Orin DLA ☆205 Updated last year
- Offline Quantization Tools for Deploy. ☆129 Updated last year
- Using pattern matcher in onnx model to match and replace subgraphs. ☆81 Updated last year
- ☆142 Updated last year
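- A demonstration of autoTVM search for optimized neural-network inference code: it compiles the open-source model centerface with tvm, uses autoTVM to search for the best inference code, and finally deploys the result compiled as C++ code. The demo platform is cuda, but other platforms are possible, such as Raspberry Pi, Android phones, and iPhones. This is a demonstration of … ☆27 Updated 4 years ago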
- Utility scripts for editing or modifying onnx models. Utility scripts to summarize onnx model files along with visualization for loop ope… ☆80 Updated 3 years ago
- TensorRT 2022 runner-up solution: accelerating the mobilevit model with tensorrt ☆68 Updated 3 years ago