leimao / PyTorch-Eager-Mode-Quantization-TensorRT-Acceleration
TensorRT Acceleration for PyTorch Native Eager Mode Quantization Models
☆17 · Updated last year
Alternatives and similar repositories for PyTorch-Eager-Mode-Quantization-TensorRT-Acceleration
Users who are interested in PyTorch-Eager-Mode-Quantization-TensorRT-Acceleration are comparing it to the libraries listed below.
- Model compression for ONNX ☆99 · Updated last year
- Spio (SPEE-oh) - Experimental CUDA kernel framework unifying typed dimensions, NVRTC JIT specialization, and ML‑guided tuning. ☆46 · Updated this week
- Converting weights of PyTorch models to ONNX & TensorRT engines ☆50 · Updated 2 years ago
- A faster implementation of OpenCV-CUDA that uses OpenCV objects, and more! ☆54 · Updated last month
- PyTorch Pruning Example ☆50 · Updated 3 years ago
- The Triton backend for the ONNX Runtime. ☆170 · Updated this week
- Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory and energy consumption ☆109 · Updated 2 years ago
- AI Edge Quantizer: flexible post training quantization for LiteRT models. ☆87 · Updated this week
- The Triton backend for TensorRT. ☆82 · Updated last month
- A tool to convert a TensorRT engine/plan to a fake ONNX ☆41 · Updated 3 years ago
- ☆208 · Updated 4 years ago
- Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. Th… ☆428 · Updated this week
- A set of simple tools for splitting, merging, OP deletion, size compression, rewriting attributes and constants, OP generation, change op… ☆303 · Updated last year
- ☆34 · Updated 6 months ago
- Profile PyTorch models for FLOPs and parameters, helping to evaluate computational efficiency and memory usage. ☆108 · Updated last week
- Count number of parameters / MACs / FLOPS for ONNX models. ☆95 · Updated last year
- PyTorch Quantization Aware Training Example ☆149 · Updated last year
- ☆70 · Updated 3 years ago
- This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transfor… ☆85 · Updated this week
- Implementation of YOLOv9 QAT optimized for deployment on TensorRT platforms. ☆129 · Updated 8 months ago
- Inference of quantization aware trained networks using TensorRT ☆83 · Updated 2 years ago
- Nsight Systems In Docker ☆20 · Updated 2 years ago
- EfficientViT is a new family of vision models for efficient high-resolution vision. ☆30 · Updated 2 years ago
- Zero-label image classification via OpenCLIP knowledge distillation ☆141 · Updated 2 years ago
- The Triton backend for the PyTorch TorchScript models. ☆170 · Updated this week
- ☆178 · Updated last year
- Generalist YOLO: Towards Real-Time End-to-End Multi-Task Visual Language Models ☆86 · Updated 8 months ago
- A Toolkit to Help Optimize ONNX Models ☆300 · Updated this week
- Timm model explorer ☆42 · Updated last year
- Common utilities for ONNX converters ☆290 · Updated 3 weeks ago