Yulv-git / Model-Inference-Deployment
A curated list of awesome inference deployment frameworks for artificial intelligence (AI) models: OpenVINO, TensorRT, MediaPipe, TensorFlow Lite, TensorFlow Serving, ONNX Runtime, LibTorch, NCNN, TNN, MNN, TVM, MACE, Paddle Lite, MegEngine Lite, OpenPPL, Bolt, ExecuTorch.
☆62 Updated last year
Alternatives and similar repositories for Model-Inference-Deployment
Users who are interested in Model-Inference-Deployment are comparing it to the repositories listed below.
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM ☆42 Updated last year
- A tool that converts a TensorRT engine/plan into a fake ONNX file ☆39 Updated 2 years ago
- A simple tool that can generate TensorRT plugin code quickly. ☆232 Updated last year
- An ONNX-based quantization tool. ☆71 Updated last year
- NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications. ☆200 Updated last year
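- Tianchi NVIDIA TensorRT Hackathon 2023, Generative AI Model Optimization Contest: third-place solution from the preliminary round ☆49 Updated last year
- ☢️ TensorRT 2023 second round: inference acceleration and optimization of the Llama model based on TensorRT-LLM ☆48 Updated last year
- Offline Quantization Tools for Deployment. ☆129 Updated last year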
- Quick and Self-Contained TensorRT Custom Plugin Implementation and Integration ☆61 Updated last month
- Serving Inside Pytorch ☆160 Updated 2 weeks ago
- Efficient deployment: TensorRT inference for YOLO X, V3, V4, V5, V6, V7, V8, and EdgeYOLO ™️, with pre- and post-processing implemented entirely as CUDA kernels. CPP/CUDA 🚀 ☆49 Updated 2 years ago
- ☆99 Updated 3 years ago
- PyTorch Quantization Aware Training Example ☆136 Updated last year
- YOLOv5 on Orin DLA ☆204 Updated last year
- ☆66 Updated 2 years ago
- ☆128 Updated last year
- A Python and C++ library for model encryption and decryption, built on Crypto++, with support for various deep learning frameworks includ… ☆46 Updated last year
- A Toolkit to Help Optimize Large ONNX Models ☆157 Updated last year
- Count number of parameters / MACs / FLOPS for ONNX models. ☆93 Updated 8 months ago
- This code accompanies the Bilibili video at https://www.bilibili.com/video/BV18L41197Uz/?spm_id_from=333.788&vd_source=eefa4b6e337f16d87d87c2c357db8ca7 ☆69 Updated last year
- ☆47 Updated 2 years ago
- Code and notes for the six major CUDA parallel computing patterns ☆61 Updated 4 years ago
- A breakdown of NCNN ☆46 Updated 4 years ago
- ☆120 Updated 2 years ago
- A large collection of CUDA/TensorRT examples for learning CUDA and TensorRT. ☆135 Updated 2 years ago
- TensorRT 7 C++ (almost) minimal examples ☆81 Updated last year
- Based on the paper "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference" ☆64 Updated 4 years ago
- Resources of our survey paper "Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies" ☆87 Updated 5 months ago
- This is an 8-bit quantization sample for YOLOv5. PTQ, QAT, and partial quantization have all been implemented, and present the results based… ☆102 Updated 2 years ago
- A converter from MegEngine to other frameworks ☆70 Updated 2 years ago