quic / ai-hub-models
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
☆497 · Updated last week
Related projects
Alternatives and complementary repositories for ai-hub-models
- Supporting PyTorch models with the Google AI Edge TFLite runtime. ☆373 · Updated this week
- The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) a… ☆66 · Updated last week
- ☆105 · Updated last month
- Run Generative AI models with a simple C++/Python API using the OpenVINO Runtime. ☆155 · Updated this week
- Low-bit LLM inference on CPU with lookup table. ☆588 · Updated this week
- Generative AI extensions for onnxruntime. ☆520 · Updated this week
- TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation… ☆573 · Updated this week
- On-device AI across mobile, embedded, and edge for PyTorch. ☆2,209 · Updated this week
- ☆303 · Updated 11 months ago
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools. ☆410 · Updated this week
- Fast Multimodal LLM on Mobile Devices. ☆534 · Updated this week
- Intel® NPU Acceleration Library. ☆511 · Updated this week
- LLaMa/RWKV ONNX models, quantization, and test cases. ☆354 · Updated last year
- A PyTorch quantization backend for optimum. ☆831 · Updated last week
- LiteRT is the new name for TensorFlow Lite (TFLite). While the name is new, it's still the same trusted, high-performance runtime for on-device AI… ☆151 · Updated last month
- Strong and Open Vision Language Assistant for Mobile Devices. ☆1,044 · Updated 7 months ago
- SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime… ☆2,228 · Updated this week
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024). ☆1,163 · Updated 2 weeks ago
- ☆1,023 · Updated 10 months ago
- A text-to-image generation project based on the open-source Stable Diffusion v1.5 model, producing models that can run on a phone's CPU and NPU, along with a companion model-execution framework. ☆111 · Updated 7 months ago
- Self-created tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive… ☆705 · Updated 3 weeks ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models. ☆1,258 · Updated 4 months ago
- Demonstration of running a native LLM on an Android device. ☆75 · Updated this week
- Universal cross-platform tokenizer bindings to HF and sentencepiece. ☆274 · Updated last week
- VPTQ, a flexible and extreme low-bit quantization algorithm. ☆529 · Updated this week
- ⚡ Build your chatbot within minutes on your favorite device; offers SOTA compression techniques for LLMs; runs LLMs efficiently on Intel Pl… ☆2,139 · Updated last month
- onnxruntime-extensions: a specialized pre- and post-processing library for ONNX Runtime. ☆340 · Updated this week
- ☆22 · Updated 2 months ago
- Examples for using ONNX Runtime for machine learning inferencing. ☆1,217 · Updated 2 weeks ago
- The Qualcomm Cloud AI SDK (Platform and Apps) enables high-performance deep learning inference on Qualcomm Cloud AI platforms, delivering high … ☆55 · Updated 3 weeks ago