quic / ai-hub-apps
The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
☆168 · Updated 2 weeks ago
Alternatives and similar repositories for ai-hub-apps:
Users interested in ai-hub-apps are comparing it to the libraries listed below.
- The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.)… ☆660 · Updated this week
- Demonstration of running a native LLM on an Android device. ☆127 · Updated this week
- ☆130 · Updated 3 weeks ago
- LLM inference in C/C++ ☆35 · Updated last week
- LiteRT is the new name for TensorFlow Lite (TFLite). While the name is new, it's still the same trusted, high-performance runtime for on-… ☆330 · Updated this week
- Supporting PyTorch models with the Google AI Edge TFLite runtime. ☆512 · Updated this week
- Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK ☆62 · Updated this week
- Stable Diffusion inference on an Android phone's CPU. ☆150 · Updated last year
- ☆32 · Updated 3 weeks ago
- Demonstration of combining YOLO and depth estimation on an Android device. ☆43 · Updated this week
- A text-to-image project based on the open-source Stable Diffusion V1.5 model, generating models that run on a phone's CPU and NPU, along with a matching model-execution framework. ☆149 · Updated last year
- Fast Multimodal LLM on Mobile Devices ☆781 · Updated last week
- A toolkit to help optimize ONNX models. ☆129 · Updated this week
- ☆29 · Updated this week
- llama.cpp tutorial on an Android phone ☆97 · Updated 8 months ago
- Focuses on implementing a ggml-hexagon backend for Qualcomm's Hexagon NPU; details at https://github.com/zhouwg/ggml-hexagon… ☆13 · Updated this week
- ☆28 · Updated last year
- Run generative AI models with a simple C++/Python API using the OpenVINO Runtime. ☆249 · Updated this week
- DragGAN in NCNN with C++ ☆50 · Updated last year
- On-device Speech Recognition for Android ☆73 · Updated 3 weeks ago
- [EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models ☆55 · Updated 6 months ago
- Generative AI extensions for onnxruntime ☆667 · Updated this week
- IRIS is an Android app for interfacing with GGUF / llama.cpp models locally. ☆192 · Updated 2 months ago
- A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, speculative decoding, et… ☆832 · Updated 2 weeks ago
- Deploying a large language model (Qwen1.5-0.5B-Chat) on Android phones with MNN-llm. ☆72 · Updated 11 months ago
- Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector… ☆252 · Updated 5 months ago
- Stable Diffusion using MNN ☆65 · Updated last year
- llama.cpp fork with additional SOTA quants and improved performance ☆231 · Updated this week
- ☆84 · Updated 2 years ago
- ☆236 · Updated 4 months ago