quic / ai-hub-models
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
☆651 · Updated last week
Alternatives and similar repositories for ai-hub-models:
Users interested in ai-hub-models are comparing it to the libraries listed below:
- The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) a… ☆168 · Updated 2 weeks ago
- Supporting PyTorch models with the Google AI Edge TFLite runtime. ☆512 · Updated this week
- LiteRT is the new name for TensorFlow Lite (TFLite). While the name is new, it's still the same trusted, high-performance runtime for on-… ☆330 · Updated this week
- ☆130 · Updated 3 weeks ago
- Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massiv… ☆772 · Updated last week
- A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, speculative decoding, et… ☆832 · Updated 2 weeks ago
- Efficient Inference of Transformer models ☆427 · Updated 7 months ago
- Conversion of PyTorch Models into TFLite ☆371 · Updated 2 years ago
- ☆316 · Updated last year
- Fast Multimodal LLM on Mobile Devices ☆781 · Updated last week
- Generative AI extensions for onnxruntime ☆667 · Updated this week
- A parser, editor and profiler tool for ONNX models. ☆422 · Updated 2 months ago
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆456 · Updated this week
- Low-bit LLM inference on CPU with lookup table ☆705 · Updated 2 months ago
- Common utilities for ONNX converters ☆261 · Updated 4 months ago
- On-device AI across mobile, embedded and edge for PyTorch ☆2,667 · Updated this week
- AIMET is a library that provides advanced quantization and compression techniques for trained neural network models. ☆2,261 · Updated this week
- Examples for using ONNX Runtime for machine learning inferencing. ☆1,344 · Updated 2 months ago
- A text-to-image generation project based on the open-source Stable Diffusion V1.5 model; it produces models that can run on a phone's CPU and NPU, along with a companion model-execution framework. ☆149 · Updated last year
- Neural Network Compression Framework for enhanced OpenVINO™ inference ☆991 · Updated this week
- [EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models ☆55 · Updated 6 months ago
- PyTorch Neural Network eXchange ☆565 · Updated this week
- onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime ☆371 · Updated this week
- Advanced Quantization Algorithm for LLMs/VLMs. ☆413 · Updated this week
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆264 · Updated 11 months ago
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python. ☆329 · Updated this week
- ONNX Optimizer ☆686 · Updated 2 weeks ago
- LLaMa/RWKV onnx models, quantization and testcase ☆359 · Updated last year
- A simple tutorial of SNPE. ☆168 · Updated 2 years ago
- Code repo for the paper "SpinQuant: LLM quantization with learned rotations" ☆241 · Updated last month