Tencent / TPAT
TensorRT Plugin Autogen Tool
☆369Updated last year
Alternatives and similar repositories for TPAT:
Users who are interested in TPAT are comparing it to the libraries listed below:
- A simple tool that can generate TensorRT plugin code quickly.☆224Updated last year
- ppl.cv is a high-performance image processing library of openPPL supporting various platforms.☆495Updated 2 months ago
- A parser, editor and profiler tool for ONNX models.☆411Updated last week
- Deploy your model with TensorRT quickly.☆764Updated last year
- Offline Quantization Tools for Deploy.☆119Updated last year
- Yinghan's Code Sample☆300Updated 2 years ago
- TensorRT 2022 second-round solution: TensorRT inference optimization of MST++, the first Transformer-based image reconstruction model☆137Updated 2 years ago
- ☆252Updated 2 years ago
- ☆140Updated 8 months ago
- Useful TensorRT plugins for PyTorch and MMDetection model conversion.☆161Updated 3 months ago
- Actively maintained ONNX Optimizer☆657Updated 10 months ago
- A sample for onnxparser working with trt user defined plugins for TRT7.0☆166Updated 4 years ago
- ☆127Updated 3 weeks ago
- ☆141Updated last week
- Common utilities for ONNX converters☆256Updated last month
- ⚡ Useful scripts when using TensorRT☆239Updated 4 years ago
- MegCC is a deep learning model compiler with an ultra-lightweight runtime that is efficient and easy to port☆476Updated 2 months ago
- Efficient operator implementations based on the Cambricon Machine Learning Unit (MLU).☆107Updated this week
- A simple high performance CUDA GEMM implementation.☆343Updated last year
- Serving Inside Pytorch☆150Updated 3 weeks ago
- Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruct…☆330Updated 4 months ago
- row-major matmul optimization☆599Updated last year
- NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.☆187Updated 7 months ago
- TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillati…☆667Updated last week
- Inference of quantization aware trained networks using TensorRT☆80Updated last year
- optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052☆467Updated 10 months ago
- EasyQuant(EQ) is an efficient and simple post-training quantization method via effectively optimizing the scales of weights and activatio…☆393Updated 2 years ago
- ☆105Updated 10 months ago
- code reading for tvm☆72Updated 2 years ago
- Server-side deep learning deployment examples☆451Updated 4 years ago