lovelyzzkei / QNN-Android-Server
Let's use the Qualcomm NPU on Android
☆17 · Updated 11 months ago
Alternatives and similar repositories for QNN-Android-Server
Users interested in QNN-Android-Server are comparing it to the libraries listed below.
- Run a Chinese MobileBERT model on SNPE.☆15 · Updated 2 years ago
- PyTorch Quantization-Aware Training (QAT); a minimal QAT sketch follows this list.☆42 · Updated 2 years ago
- Inference of RWKV v5, v6, and v7 with the Qualcomm AI Engine Direct SDK.☆90 · Updated 2 weeks ago
- Demonstration of combining YOLO and depth estimation on an Android device.☆66 · Updated 2 months ago
- deepstream + cuda, yolo26, yolo-master, yolo11, yolov8, sam, transformer, etc.☆35 · Updated last week
- ☆179 · Updated 2 weeks ago
- Exports ONNX QDQ models that conform to the AXERA NPU quantization specification; currently only w8a8 is supported.☆11 · Updated last year
- Basic quantization methods, including QAT, PTQ, per_channel, per_tensor, dorefa, lsq, adaround, omse, histogram, bias_correction, etc.☆51 · Updated 3 years ago
- A toolkit to help optimize large ONNX models (see the ONNX sketch after this list).☆163 · Updated 3 months ago
- YOLOv7-tiny model inference on Qualcomm SNPE for pedestrian detection on an embedded system.☆13 · Updated last year
- A simple tutorial for SNPE.☆183 · Updated 2 years ago
- QAI AppBuilder is designed to help developers easily execute models on WoS and Linux platforms. It encapsulates the Qualcomm® AI Runtime …☆111 · Updated this week
- Quick and Self-Contained TensorRT Custom Plugin Implementation and Integration☆78 · Updated 8 months ago
- High-performance, light-weight C++ LLM and VLM Inference Software for Physical AI☆227 · Updated 3 weeks ago
- A set of examples around MegEngine☆31 · Updated 2 years ago
- A repository for practicing multi-threaded programming in C++.☆27 · Updated last year
- Llama3 Streaming Chat Sample.☆22 · Updated last year
- LLM deployment project based on ONNX.☆49 · Updated last year
- An ONNX-based quantization tool.☆71 · Updated 2 years ago
- EasyNN is a neural network inference framework built for teaching, so that anyone can write an inference framework from scratch, even with zero background.☆37 · Updated last year
- Simplify ONNX models larger than 2 GB.☆70 · Updated last year
- ☆26 · Updated 2 years ago
- Stable Diffusion using MNN.☆67 · Updated 2 years ago
- Offline quantization tools for deployment.☆142 · Updated 2 years ago
- A lightweight, production-ready C++ library for LLM tokenization, fully compatible with HuggingFace tokenizer.json.☆21 · Updated last month
- Third-prize solution for trt-hackathon-2022.☆10 · Updated 2 years ago
- NVIDIA TensorRT Hackathon 2023 final-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM.☆43 · Updated 2 years ago
- SNPE tutorial.☆10 · Updated 2 years ago
- Deep insight into TensorRT, including but not limited to QAT, PTQ, plugins, triton_inference, and CUDA.☆23 · Updated 3 weeks ago
- Large language model ONNX inference framework.☆36 · Updated 2 months ago
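Several of the entries above revolve around PyTorch quantization-aware training. As a point of reference, the sketch below shows the generic eager-mode QAT workflow with PyTorch's `torch.ao.quantization` API; it is not taken from any of the listed repositories, and the tiny model, backend choice, and training loop are placeholder assumptions.

```python
# Minimal eager-mode QAT sketch (generic PyTorch workflow, not code from the repos above).
import torch
import torch.nn as nn
from torch.ao.quantization import (
    QuantStub, DeQuantStub, get_default_qat_qconfig, prepare_qat, convert
)

class TinyNet(nn.Module):
    """Placeholder model used only to illustrate the QAT flow."""
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()       # marks where tensors enter the quantized region
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = DeQuantStub()   # marks where tensors leave the quantized region

    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))

model = TinyNet().train()
model.qconfig = get_default_qat_qconfig("fbgemm")  # use "qnnpack" when targeting ARM/Android
prepare_qat(model, inplace=True)                   # inserts fake-quant modules and observers

# Stand-in for a real fine-tuning loop: a few forward passes so the observers see data.
for _ in range(3):
    model(torch.randn(4, 3, 32, 32))

model.eval()
int8_model = convert(model, inplace=False)         # folds fake-quant into int8 kernels
print(int8_model)
```

On an actual Snapdragon target the converted model would typically be re-exported (e.g. to ONNX or a vendor format) before handing it to SNPE/QNN; that step is toolchain-specific and omitted here.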
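A couple of the entries above focus on optimizing or simplifying ONNX models that exceed the 2 GB protobuf limit. The snippet below is a rough sketch of the usual pattern with the `onnx` and `onnx-simplifier` packages: load, simplify, then re-save with weights as external data. File names are placeholders, and very large graphs may need extra care, so treat this as a starting point rather than a recipe from any specific repository above.

```python
# Sketch: simplify an ONNX model and save it with external data (needed above 2 GB).
# "model.onnx" / "model_simplified.*" are placeholder paths.
import onnx
from onnxsim import simplify

model = onnx.load("model.onnx")      # external tensor files next to the model are picked up

simplified, ok = simplify(model)     # constant folding and graph-level simplification
assert ok, "onnxsim could not verify the simplified model"

onnx.save_model(
    simplified,
    "model_simplified.onnx",
    save_as_external_data=True,      # store tensors outside the protobuf
    all_tensors_to_one_file=True,
    location="model_simplified.data",
    size_threshold=1024,             # keep tiny tensors inline
)
```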