quic / ai-hub-models
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
☆745 Updated last week
Alternatives and similar repositories for ai-hub-models
Users who are interested in ai-hub-models are comparing it to the libraries listed below.
- The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) a… ☆245 Updated last week
- Supporting PyTorch models with the Google AI Edge TFLite runtime. ☆721 Updated this week
- LiteRT continues the legacy of TensorFlow Lite as the trusted, high-performance runtime for on-device AI. Now with LiteRT Next, we're exp… ☆658 Updated this week
- ☆145 Updated last month
- Generative AI extensions for onnxruntime ☆765 Updated this week
- On-device AI across mobile, embedded and edge for PyTorch ☆3,055 Updated this week
- Run Generative AI models with simple C++/Python API and using OpenVINO Runtime ☆310 Updated this week
- Examples for using ONNX Runtime for machine learning inferencing. ☆1,438 Updated this week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆478 Updated last week
- A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. … ☆1,068 Updated last week
- TinyChatEngine: On-Device LLM Inference Library ☆875 Updated last year
- ☆332 Updated last year
- Demonstration of running a native LLM on Android device. ☆155 Updated this week
- Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massiv… ☆830 Updated this week
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆287 Updated last year
- Conversion of PyTorch Models into TFLite ☆387 Updated 2 years ago
- Efficient Inference of Transformer models ☆443 Updated 11 months ago
- PyTorch Neural Network eXchange ☆602 Updated this week
- MobileLLM Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024. ☆1,310 Updated 3 months ago
- Qualcomm Cloud AI SDK (Platform and Apps) enable high performance deep learning inference on Qualcomm Cloud AI platforms delivering high … ☆62 Updated 2 months ago
- Fast Multimodal LLM on Mobile Devices ☆965 Updated this week
- onnxruntime-extensions: A specialized pre- and post- processing library for ONNX Runtime ☆401 Updated this week
- SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX R… ☆2,454 Updated this week
- A pytorch quantization backend for optimum ☆968 Updated 3 weeks ago
- Low-bit LLM inference on CPU/NPU with lookup table ☆833 Updated last month
- A parser, editor and profiler tool for ONNX models. ☆446 Updated last month
- This repository contains tutorials and examples for Triton Inference Server ☆736 Updated this week
- QAI AppBuilder is designed to help developers easily execute models on WoS and Linux platforms. It encapsulates the Qualcomm® AI Runtime … ☆56 Updated this week
- Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU. Seamlessly integrated with Torchao, Tra… ☆531 Updated this week
- Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector… ☆296 Updated 9 months ago