leimao / PyTorch-Eager-Mode-Quantization-TensorRT-Acceleration
TensorRT Acceleration for PyTorch Native Eager Mode Quantization Models
☆14 Updated 5 months ago
Alternatives and similar repositories for PyTorch-Eager-Mode-Quantization-TensorRT-Acceleration:
Users who are interested in PyTorch-Eager-Mode-Quantization-TensorRT-Acceleration are comparing it to the libraries listed below.
- A faster implementation of OpenCV-CUDA that uses OpenCV objects, and more! ☆47 Updated this week
- Efficient CUDA kernels for training convolutional neural networks with PyTorch. ☆38 Updated last month
- Converting weights of PyTorch models to ONNX & TensorRT engines ☆47 Updated last year
- Model compression for ONNX ☆81 Updated 2 months ago
- A tool to convert a TensorRT engine/plan to a fake ONNX ☆37 Updated 2 years ago
- The Triton backend for TensorRT. ☆68 Updated this week
- Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch), including a converter from PyTorch -> O… ☆32 Updated 3 years ago
- Simplify Your Visual Data Ops. Find and visualize issues with your computer vision datasets such as duplicates, anomalies, data leakage, … ☆67 Updated last year
- ☆31 Updated 7 months ago
- A block oriented training approach for inference time optimization. ☆32 Updated 5 months ago
- ☆30 Updated 2 years ago
- Timm model explorer ☆36 Updated 9 months ago
- ☆9 Updated 2 years ago
- Simple example of FastAPI + Celery + Triton for benchmarking ☆63 Updated 2 years ago
- A very simple tool that compresses the overall size of the ONNX model by aggregating duplicate constant values as much as possible. ☆52 Updated 2 years ago
- New operators for the ReferenceEvaluator, new kernels for onnxruntime, CPU, CUDA ☆31 Updated 3 months ago
- Simple and easy stable diffusion inference with LightningModule on GPU, CPU and MPS (Possibly all devices supported by Lightning). ☆17 Updated last year
- TorchServe + TensorRT + Detection ☆18 Updated 2 years ago
- Page for the CVPR 2023 Tutorial - Efficient Neural Networks: From Algorithm Design to Practical Mobile Deployments ☆12 Updated last year
- Article about deploying machine learning models using gRPC, PyTorch and asyncio ☆27 Updated 2 years ago
- ☆31 Updated last year
- Mixed precision training from scratch with Tensors and CUDA ☆21 Updated 8 months ago
- Context Manager to profile the forward and backward times of PyTorch's nn.Module ☆84 Updated last year
- HunyuanDiT with TensorRT and libtorch ☆17 Updated 7 months ago
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance. ☆75 Updated this week
- Nsight Compute in Docker ☆11 Updated last year
- Torch Distributed Experimental ☆115 Updated 5 months ago
- Plugin for deploying MLflow models to TorchServe ☆107 Updated last year
- ☆20 Updated 2 years ago