qualcomm/aimet

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

☆2,670

Alternatives and similar repositories for aimet

Users that are interested in aimet are comparing it to the libraries listed below.

quic / aimet-model-zoo
☆346 · Feb 12, 2026 · Updated 5 months ago
ModelTC / MQBench
Model Quantization Benchmark
☆874 · Apr 20, 2025 · Updated last year
jakc4103 / DFQ
PyTorch implementation of Data Free Quantization Through Weight Equalization and Bias Correction.
☆264 · Oct 3, 2023 · Updated 2 years ago
OpenPPL / ppq
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
☆1,807 · Mar 28, 2024 · Updated 2 years ago
AI-Efficiency / Awesome-Model-Quantization
A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are co…
☆2,409 · Jul 10, 2026 · Updated 2 weeks ago
openvinotoolkit / nncf
Neural Network Compression Framework for enhanced OpenVINO™ inference
☆1,183 · Updated this week
onnxsim / onnxsim
Simplify your onnx model
☆4,372 · Updated this week
Zhen-Dong / HAWQ
Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.
☆462 · May 15, 2023 · Updated 3 years ago
yhhhli / BRECQ
Pytorch implementation of BRECQ, ICLR 2021
☆300 · Aug 1, 2021 · Updated 4 years ago
666DZY666 / micronet
micronet, a model compression and deploy lib. compression: 1、quantization: quantization-aware-training(QAT), High-Bit(>2b)(DoReFa/Quantiz…
☆2,266 · May 6, 2025 · Updated last year
Xilinx / brevitas
Brevitas: neural network quantization in PyTorch
☆1,555 · Updated this week
Qualcomm-AI-research / transformer-quantization
☆212 · Nov 9, 2021 · Updated 4 years ago
intel / neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, …
☆2,684 · Updated this week
alibaba / TinyNeuralNetwork
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
☆879 · Mar 3, 2026 · Updated 4 months ago
apache / tvm
Open Machine Learning Compiler Framework
☆13,607 · Updated this week
VainF / Torch-Pruning
[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.
☆3,328 · Sep 7, 2025 · Updated 10 months ago
itayhubara / CalibTIP
Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming
☆97 · Jun 10, 2021 · Updated 5 years ago
Jermmy / pytorch-quantization-demo
A simple network quantization demo using pytorch from scratch.
☆543 · Jun 18, 2023 · Updated 3 years ago
NVIDIA / TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source compone…
☆13,182 · Jul 7, 2026 · Updated 2 weeks ago
amirgholami / ZeroQ
[CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework
☆280 · Dec 8, 2023 · Updated 2 years ago
mit-han-lab / once-for-all
[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment
☆1,953 · Dec 14, 2023 · Updated 2 years ago
OpenPPL / ppl.nn
A primitive library for neural network
☆1,367 · Nov 24, 2024 · Updated last year
deepglint / EasyQuant
EasyQuant(EQ) is an efficient and simple post-training quantization method via effectively optimizing the scales of weights and activatio…
☆407 · Nov 22, 2022 · Updated 3 years ago
pytorch / TensorRT
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
☆2,980 · Updated this week
Tencent / ncnn
ncnn is a high-performance neural network inference framework optimized for the mobile platform
☆23,580 · Updated this week
pytorch / QNNPACK
Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators
☆1,550 · Aug 28, 2019 · Updated 6 years ago
ZhangGe6 / onnx-modifier
A tool to modify ONNX models in a visualization fashion, based on Netron and Flask.
☆1,624 · Jun 27, 2026 · Updated 3 weeks ago
google / gemmlowp
Low-precision matrix multiplication
☆1,845 · Jan 29, 2024 · Updated 2 years ago
ynahshan / nn-quantization-pytorch
☆59 · Dec 8, 2020 · Updated 5 years ago
IST-DASLab / gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
☆2,337 · Mar 27, 2024 · Updated 2 years ago
he-y / Awesome-Pruning
A curated list of neural network pruning resources.
☆2,496 · Apr 4, 2024 · Updated 2 years ago
yhhhli / APoT_Quantization
PyTorch implementation for the APoT quantization (ICLR 2020)
☆288 · Dec 11, 2024 · Updated last year
qualcomm / ai-hub-models
Qualcomm® AI Hub Models is our collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) an…
☆1,167 · Updated this week
google / XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
☆2,403 · Updated this week
alibaba / MNN
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering high-performance on-device LLMs and Edge AI.
☆15,715 · Updated this week
submission2019 / cnn-quantization
Quantization of Convolutional Neural networks.
☆250 · Aug 5, 2024 · Updated last year
ThanatosShinji / onnx-tool
A tool for parsing, editing, optimizing, and profiling ONNX models.
☆491 · Jun 8, 2026 · Updated last month
huawei-noah / bolt
Bolt is a deep learning library with high performance and heterogeneous flexibility.
☆958 · Apr 11, 2025 · Updated last year
zhutmost / lsq-net
Unofficial implementation of LSQ-Net, a neural network quantization framework
☆316 · May 8, 2024 · Updated 2 years ago