quic / aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
☆2,526 Updated last week
Alternatives and similar repositories for aimet
Users who are interested in aimet compare it to the libraries listed below.
- Simplify your onnx model ☆4,261 Updated 4 months ago
- PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT ☆2,915 Updated this week
- Neural Network Compression Framework for enhanced OpenVINO™ inference ☆1,112 Updated this week
- ☆340 Updated 2 years ago
- [ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment ☆1,940 Updated 2 years ago
- Tensorflow Backend for ONNX ☆1,325 Updated last year
- ONNX-TensorRT: TensorRT backend for ONNX ☆3,175 Updated 2 months ago
- SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, … ☆2,561 Updated this week
- A tool to modify ONNX models in a visualization fashion, based on Netron and Flask. ☆1,596 Updated last month
- ONNX Optimizer ☆784 Updated this week
- TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework. ☆863 Updated last week
- An easy to use PyTorch to TensorRT converter ☆4,839 Updated last year
- A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are co… ☆2,299 Updated 10 months ago
- Brevitas: neural network quantization in PyTorch ☆1,461 Updated last week
- Model Quantization Benchmark ☆855 Updated 8 months ago
- A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresse… ☆1,744 Updated last week
- Convert ONNX models to PyTorch. ☆719 Updated 2 months ago
- A parser, editor and profiler tool for ONNX models. ☆470 Updated 2 months ago
- ☆1,045 Updated 2 years ago
- PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool. ☆1,774 Updated last year
- [NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep L… ☆916 Updated last year
- Bolt is a deep learning library with high performance and heterogeneous flexibility. ☆955 Updated 8 months ago
- Arm NN ML Software. ☆1,291 Updated last month
- Reference implementations of MLPerf® inference benchmarks ☆1,514 Updated 2 weeks ago
- Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massiv… ☆902 Updated last week
- CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision. ☆2,626 Updated last month
- [CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc. ☆3,229 Updated 3 months ago
- A curated list of neural network pruning resources. ☆2,488 Updated last year
- High-efficiency floating-point neural network inference operators for mobile, server, and Web ☆2,213 Updated this week
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H… ☆3,055 Updated this week