google-ai-edge / ai-edge-quantizer
AI Edge Quantizer: flexible post-training quantization for LiteRT models.
☆23Updated last month
Alternatives and similar repositories for ai-edge-quantizer:
Users who are interested in ai-edge-quantizer are comparing it to the libraries listed below.
- A very simple tool for situations where optimization with onnx-simplifier would exceed the Protocol Buffers upper file size limit of 2GB,…☆15Updated 8 months ago
- New operators for the ReferenceEvaluator, new kernels for onnxruntime, CPU, CUDA☆31Updated 4 months ago
- Model compression for ONNX☆81Updated 2 months ago
- Simple tool for partial optimization of ONNX. Further optimize some models that cannot be optimized with onnx-optimizer and onnxsim by se…☆19Updated 8 months ago
- Convert tflite to JSON and make it editable in the IDE. It also converts the edited JSON back to tflite binary.☆27Updated last year
- ONNX and TensorRT implementation of Whisper☆61Updated last year
- ☆18Updated 2 weeks ago
- ☆22Updated 3 weeks ago
- ☆21Updated this week
- Exports the ONNX file to a JSON file and JSON dict.☆33Updated 2 years ago
- TensorFlow, TensorFlow-Lite, PyTorch, Torchvision, TensorRT Benchmarks☆22Updated 2 months ago
- Experiments with BitNet inference on CPU☆52Updated 9 months ago
- APPy (Annotated Parallelism for Python) enables users to annotate loops and tensor expressions in Python with compiler directives akin to…☆23Updated last month
- cross-platform high speed inference SDK☆34Updated last week
- TAO Toolkit deep learning networks with TensorFlow 1.x backend☆13Updated 11 months ago
- [EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models☆54Updated 4 months ago
- ONNX Adapter for model-explorer☆27Updated 4 months ago
- A very simple tool that compresses the overall size of the ONNX model by aggregating duplicate constant values as much as possible.☆52Updated 2 years ago
- ☆21Updated 3 months ago
- Sample that draws audio spectra and waveforms with OpenCV☆13Updated 2 years ago
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend.☆106Updated this week
- Simple tool to combine (merge) ONNX models. Simple Network Combine Tool for ONNX.☆17Updated 9 months ago
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆38Updated 8 months ago
- ☆12Updated 3 weeks ago
- Memory Optimizations for Deep Learning (ICML 2023)☆62Updated 10 months ago
- Sample that runs ONNX inference with EfficientSAM on Colaboratory☆10Updated 9 months ago
- Describing How to Enable OpenVINO Execution Provider for ONNX Runtime☆19Updated 4 years ago
- Demonstrates how to divide a DL model into multiple IR model files (division) and introduces the simplest way to implement a custom layer wo…☆12Updated 3 years ago
- Test data for DALI project☆41Updated 2 months ago