quic / ai-hub-models
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
☆815 Updated 2 weeks ago
Alternatives and similar repositories for ai-hub-models
Users who are interested in ai-hub-models are comparing it to the libraries listed below.
- The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) a… ☆320 Updated 2 weeks ago
- Supporting PyTorch models with the Google AI Edge TFLite runtime. ☆811 Updated this week
- LiteRT, successor to TensorFlow Lite, is Google's on-device framework for high-performance ML & GenAI deployment on edge platforms, via e… ☆894 Updated this week
- ☆164 Updated 4 months ago
- Generative AI extensions for onnxruntime ☆861 Updated this week
- Run Generative AI models with simple C++/Python API and using OpenVINO Runtime ☆364 Updated last week
- On-device AI across mobile, embedded and edge for PyTorch ☆3,374 Updated this week
- A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. … ☆1,464 Updated last week
- ☆337 Updated last year
- Qualcomm Cloud AI SDK (Platform and Apps) enable high performance deep learning inference on Qualcomm Cloud AI platforms delivering high … ☆66 Updated 2 months ago
- Examples for using ONNX Runtime for machine learning inferencing. ☆1,516 Updated last week
- TinyChatEngine: On-Device LLM Inference Library ☆906 Updated last year
- Demonstration of running a native LLM on Android device. ☆191 Updated last month
- Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massiv… ☆869 Updated this week
- Conversion of PyTorch Models into TFLite ☆392 Updated 2 years ago
- onnxruntime-extensions: A specialized pre- and post- processing library for ONNX Runtime ☆418 Updated last week
- Fast Multimodal LLM on Mobile Devices ☆1,132 Updated this week
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆295 Updated last year
- A Toolkit to Help Optimize ONNX Model ☆228 Updated this week
- This repository contains tutorials and examples for Triton Inference Server ☆792 Updated 2 weeks ago
- Low-bit LLM inference on CPU/NPU with lookup table ☆876 Updated 4 months ago
- AIMET is a library that provides advanced quantization and compression techniques for trained neural network models. ☆2,471 Updated this week
- A parser, editor and profiler tool for ONNX models. ☆460 Updated 2 months ago
- Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector… ☆327 Updated last year
- Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU. ☆679 Updated this week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆502 Updated this week
- This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai ☆91 Updated this week
- A PyTorch quantization backend for Optimum ☆999 Updated last week
- Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. Th… ☆419 Updated last week
- LLM inference in C/C++ ☆46 Updated last week