leimao / PyTorch-Eager-Mode-Quantization-TensorRT-Acceleration
TensorRT Acceleration for PyTorch Native Eager Mode Quantization Models
☆17Updated last year
Alternatives and similar repositories for PyTorch-Eager-Mode-Quantization-TensorRT-Acceleration
Users who are interested in PyTorch-Eager-Mode-Quantization-TensorRT-Acceleration are comparing it to the libraries listed below.
- Model compression for ONNX☆99Updated last year
- PyTorch Pruning Example☆50Updated 2 years ago
- A faster implementation of OpenCV-CUDA that uses OpenCV objects, and more!☆54Updated last week
- Converting weights of PyTorch models to ONNX & TensorRT engines☆50Updated 2 years ago
- This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transfor…☆83Updated this week
- Experimental CUDA kernel framework unifying typed dimensions, NVRTC JIT specialization, and ML‑guided tuning.☆43Updated this week
- AI Edge Quantizer: flexible post training quantization for LiteRT models.☆78Updated this week
- Profile PyTorch models for FLOPs and parameters, helping to evaluate computational efficiency and memory usage.☆65Updated 2 weeks ago
- Timm model explorer☆42Updated last year
- ☆34Updated 5 months ago
- Inference Vision Transformer (ViT) in plain C/C++ with ggml☆298Updated last year
- The Triton backend for the PyTorch TorchScript models.☆164Updated last week
- Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory, and energy consumption☆109Updated 2 years ago
- Nsight Systems In Docker☆20Updated last year
- A Toolkit to Help Optimize Large ONNX Models☆162Updated 3 weeks ago
- The Triton backend for TensorRT.☆79Updated last week
- A Toolkit to Help Optimize ONNX Models☆236Updated 2 weeks ago
- ☆178Updated last year
- ☆207Updated 4 years ago
- PyTorch Quantization Aware Training Example☆144Updated last year
- A block oriented training approach for inference time optimization.☆33Updated last year
- A library that contains a rich collection of performant PyTorch model metrics, a simple interface to create new metrics, a toolkit to fac…☆244Updated last month
- A tutorial introducing knowledge distillation as an optimization technique for deployment on NVIDIA Jetson☆221Updated 2 years ago
- DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos☆59Updated 2 years ago
- Count number of parameters / MACs / FLOPS for ONNX models.☆95Updated last year
- A set of simple tools for splitting, merging, OP deletion, size compression, rewriting attributes and constants, OP generation, change op…☆300Updated last year
- ☆176Updated 2 years ago
- New operators for the ReferenceEvaluator, new kernels for onnxruntime, CPU, CUDA☆35Updated 2 weeks ago
- Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. Th…☆425Updated this week
- A tool to convert a TensorRT engine/plan to a fake ONNX model☆41Updated 3 years ago