quic / ai-hub-models
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
☆672 · Updated 2 weeks ago
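For a sense of the workflow, here is a minimal sketch of loading one of the collection's models through the `qai_hub_models` Python package. The per-model module layout and the `Model.from_pretrained()` entry point follow the pattern shown in the repository's examples, but the specific model name and input shape used here are assumptions:

```python
# Minimal sketch, assuming `pip install qai-hub-models` and that the package
# exposes per-model modules with a Model.from_pretrained() constructor, as in
# the repository's examples. Model choice and input shape are assumptions.
import torch
from qai_hub_models.models.mobilenet_v2 import Model

model = Model.from_pretrained()  # fetches pretrained weights
model.eval()

# Dummy NCHW image batch; real use would apply the model's own preprocessing.
x = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    out = model(x)
print(out.shape)
```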
Alternatives and similar repositories for ai-hub-models:
Users who are interested in ai-hub-models are comparing it to the libraries listed below.
- The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) a… ☆180 · Updated 2 weeks ago
- Supporting PyTorch models with the Google AI Edge TFLite runtime. ☆555 · Updated this week
- Generative AI extensions for onnxruntime ☆693 · Updated this week
- LiteRT is the new name for TensorFlow Lite (TFLite). While the name is new, it's still the same trusted, high-performance runtime for on-… ☆365 · Updated this week
- Demonstration of running a native LLM on an Android device. ☆129 · Updated last week
- nvidia-modelopt is a unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculat… ☆870 · Updated last week
- Fast Multimodal LLM on Mobile Devices ☆830 · Updated last month
- On-device AI across mobile, embedded and edge for PyTorch ☆2,747 · Updated this week
- Examples for using ONNX Runtime for machine learning inferencing (see the inference sketch after this list). ☆1,354 · Updated last week
- Low-bit LLM inference on CPU with lookup table ☆735 · Updated 3 months ago
- Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. Th… ☆388 · Updated this week
- LLaMa/RWKV onnx models, quantization and testcase ☆361 · Updated last year
- Efficient Inference of Transformer models ☆432 · Updated 8 months ago
- Strong and Open Vision Language Assistant for Mobile Devices ☆1,198 · Updated last year
- Qualcomm Cloud AI SDK (Platform and Apps) enable high performance deep learning inference on Qualcomm Cloud AI platforms delivering high … ☆59 · Updated 6 months ago
- Advanced Quantization Algorithm for LLMs/VLMs. ☆438 · Updated this week
- onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime ☆375 · Updated this week
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆266 · Updated last year
- Run Generative AI models with simple C++/Python API and using OpenVINO Runtime ☆260 · Updated this week
- A Toolkit to Help Optimize ONNX Models ☆140 · Updated this week
- A parser, editor and profiler tool for ONNX models. ☆425 · Updated 3 months ago
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆460 · Updated this week
- TinyChatEngine: On-Device LLM Inference Library ☆837 · Updated 9 months ago
- ONNX Optimizer ☆696 · Updated 3 weeks ago
- SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX R… ☆2,380 · Updated this week
- Official implementation of Half-Quadratic Quantization (HQQ) ☆791 · Updated this week
- Intel® NPU Acceleration Library ☆667 · Updated 3 months ago
- llm-export can export LLM models to ONNX. ☆282 · Updated 3 months ago
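Since several entries above revolve around ONNX Runtime (the examples repository in particular), here is a minimal inference sketch using the `onnxruntime` Python API; the model path and input shape are placeholders you would replace with your own:

```python
# Minimal ONNX Runtime inference sketch. Assumptions: "model.onnx" exists
# locally and takes a single float32 NCHW image input.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Read the graph's declared input name so the feed dict matches the model.
input_name = session.get_inputs()[0].name

x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # dummy batch
outputs = session.run(None, {input_name: x})
print([o.shape for o in outputs])
```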