quic / aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
☆2,492 Updated this week
Alternatives and similar repositories for aimet
Users who are interested in aimet are comparing it to the libraries listed below.
- PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT ☆2,878 Updated this week
- ☆338 Updated last year
- [ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment ☆1,937 Updated last year
- SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, … ☆2,520 Updated this week
- Simplify your onnx model ☆4,222 Updated 2 months ago
- ONNX Optimizer ☆770 Updated last week
- Neural Network Compression Framework for enhanced OpenVINO™ inference ☆1,098 Updated this week
- A tool to modify ONNX models in a visualization fashion, based on Netron and Flask. ☆1,572 Updated 8 months ago
- TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework. ☆854 Updated 2 months ago
- TensorFlow Backend for ONNX ☆1,325 Updated last year
- A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are co… ☆2,263 Updated 8 months ago
- Model Quantization Benchmark ☆847 Updated 6 months ago
- ONNX-TensorRT: TensorRT backend for ONNX ☆3,165 Updated 2 months ago
- A parser, editor and profiler tool for ONNX models. ☆462 Updated last week
- A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. … ☆1,512 Updated this week
- Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distille… ☆4,401 Updated 2 years ago
- Brevitas: neural network quantization in PyTorch ☆1,422 Updated this week
- An easy to use PyTorch to TensorRT converter ☆4,826 Updated last year
- High-efficiency floating-point neural network inference operators for mobile, server, and Web ☆2,158 Updated this week
- A curated list of neural network pruning resources. ☆2,481 Updated last year
- A coding-free framework built on PyTorch for reproducible deep learning studies. PyTorch Ecosystem. 🏆26 knowledge distillation methods p… ☆1,568 Updated last week
- Reference implementations of MLPerf® inference benchmarks ☆1,480 Updated this week
- PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool. ☆1,766 Updated last year
- A primitive library for neural network ☆1,369 Updated 11 months ago
- Convert ONNX models to PyTorch. ☆707 Updated 3 weeks ago
- ☆1,011 Updated last year
- Transform ONNX model to PyTorch representation ☆342 Updated last week
- Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massiv… ☆875 Updated 2 weeks ago
- Arm NN ML Software. ☆1,287 Updated last week
- A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning. ☆1,558 Updated this week