openvinotoolkit / nncf
Neural Network Compression Framework for enhanced OpenVINO™ inference
☆943 · Updated this week
Related projects
Alternatives and complementary repositories for nncf
- SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX R… ☆2,227 · Updated this week
- Actively maintained ONNX Optimizer ☆647 · Updated 8 months ago
- A parser, editor and profiler tool for ONNX models. ☆400 · Updated this week
- Model Quantization Benchmark ☆765 · Updated 5 months ago
- TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillati… ☆567 · Updated this week
- PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT ☆2,597 · Updated this week
- Common utilities for ONNX converters ☆251 · Updated 5 months ago
- Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM. ☆413 · Updated last year
- Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. Th… ☆328 · Updated this week
- Deploy your model with TensorRT quickly. ☆762 · Updated 11 months ago
- onnxruntime-extensions: A specialized pre- and post- processing library for ONNX Runtime ☆338 · Updated this week
- ONNX-TensorRT: TensorRT backend for ONNX ☆2,953 · Updated 2 weeks ago
- Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Serv… ☆433 · Updated last week
- ⚡ Useful scripts when using TensorRT ☆240 · Updated 4 years ago
- A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices. ☆336 · Updated 3 months ago
- Transform ONNX model to PyTorch representation ☆318 · Updated last week
- Awesome machine learning model compression research papers, quantization, tools, and learning material. ☆491 · Updated last month
- A general and accurate MACs / FLOPs profiler for PyTorch models ☆571 · Updated 6 months ago
- [ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment ☆1,884 · Updated 11 months ago
- TensorFlow/TensorRT integration ☆736 · Updated 11 months ago
- TensorRT Plugin Autogen Tool ☆367 · Updated last year
- Mobile vision models and code ☆902 · Updated 4 months ago
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python. ☆286 · Updated this week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆409 · Updated this week
- A scalable inference server for models optimized with OpenVINO™ ☆675 · Updated this week
- PyTorch implementation of Data Free Quantization Through Weight Equalization and Bias Correction. ☆258 · Updated last year
- Train, Evaluate, Optimize, Deploy Computer Vision Models via OpenVINO™ ☆1,143 · Updated this week
- OpenVINO™ Explainable AI (XAI) Toolkit: Visual Explanation for OpenVINO Models ☆28 · Updated last month
- ☆302 · Updated 11 months ago
- Accelerate PyTorch models with ONNX Runtime ☆356 · Updated 2 months ago