quic / ai-hub-models
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
☆660 · Updated this week
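To give a concrete feel for the repository, here is a minimal sketch of loading one of the packaged models through the pip package. It assumes the `qai_hub_models` package name and the per-model `Model.from_pretrained()` convention from the project's documentation; the choice of `mobilenet_v2` is illustrative, not prescriptive.

```python
# Minimal sketch, assuming the qai_hub_models pip package and its
# per-model Model.from_pretrained() convention; mobilenet_v2 is just
# one example model family.
#   pip install qai_hub_models
import torch
from qai_hub_models.models.mobilenet_v2 import Model

model = Model.from_pretrained()  # fetches pre-optimized weights
model.eval()

# Run a forward pass on a dummy ImageNet-sized input.
x = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    logits = model(x)
print(logits.shape)
```

The repository also documents per-model demo and export entry points (e.g. `python -m qai_hub_models.models.<name>.export`) for compiling a model toward a target device.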
Alternatives and similar repositories for ai-hub-models:
Users interested in ai-hub-models are comparing it to the libraries listed below.
- The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) a… ☆168 · Updated 2 weeks ago
- LiteRT is the new name for TensorFlow Lite (TFLite). While the name is new, it's still the same trusted, high-performance runtime for on-… ☆330 · Updated this week
- Supporting PyTorch models with the Google AI Edge TFLite runtime. ☆512 · Updated this week
- ☆130 · Updated 3 weeks ago
- On-device AI across mobile, embedded and edge for PyTorch ☆2,667 · Updated this week
- A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, speculative decoding, et… ☆832 · Updated 2 weeks ago
- ☆316 · Updated last year
- Generative AI extensions for onnxruntime ☆667 · Updated this week
- Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massiv… ☆772 · Updated last week
- Efficient Inference of Transformer models ☆427 · Updated 7 months ago
- Demonstration of running a native LLM on an Android device. ☆127 · Updated this week
- onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime ☆371 · Updated this week
- Examples for using ONNX Runtime for machine learning inferencing (a minimal inference sketch follows this list). ☆1,344 · Updated 2 months ago
- Run generative AI models with a simple C++/Python API using the OpenVINO Runtime ☆249 · Updated this week
- [EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models ☆55 · Updated 6 months ago
- Demonstration of combining YOLO and depth estimation on an Android device. ☆43 · Updated this week
- TinyChatEngine: On-Device LLM Inference Library ☆830 · Updated 8 months ago
- ☆32 · Updated 3 weeks ago
- Fast Multimodal LLM on Mobile Devices ☆781 · Updated last week
- High-performance, optimized pre-trained template AI application pipelines for systems using Hailo devices ☆125 · Updated last week
- The Qualcomm Cloud AI SDK (Platform and Apps) enables high-performance deep learning inference on Qualcomm Cloud AI platforms, delivering high … ☆56 · Updated 5 months ago
- A toolkit to help optimize ONNX models ☆129 · Updated this week
- A parser, editor and profiler tool for ONNX models. ☆422 · Updated 2 months ago
- PyTorch Neural Network eXchange ☆565 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆262 · Updated 5 months ago
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python. ☆329 · Updated this week
- Common utilities for ONNX converters ☆261 · Updated 4 months ago
- Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector… ☆252 · Updated 5 months ago
- PyTorch to Keras/TensorFlow/TFLite conversion made intuitive ☆298 · Updated 3 weeks ago
- Low-bit LLM inference on CPU with lookup table ☆705 · Updated 2 months ago
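Several of the entries above center on ONNX Runtime; as referenced from the examples entry, the sketch below shows the shape of a basic inference call through its Python API. It is a minimal sketch: `model.onnx` and the NCHW input shape are placeholders for whatever model you export, not artifacts of any repository listed here.

```python
# Minimal ONNX Runtime inference sketch; "model.onnx" and the assumed
# NCHW input shape are placeholders.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Read the graph's declared input instead of hard-coding a tensor name.
inp = session.get_inputs()[0]
print(inp.name, inp.shape, inp.type)

x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed input shape
outputs = session.run(None, {inp.name: x})
print(outputs[0].shape)
```

Swapping `CPUExecutionProvider` for a hardware-specific provider is how the same script targets different accelerators, which is the axis along which many of the runtimes above differentiate themselves.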