The original reference implementation of a dedicated llama.cpp backend for the Qualcomm Hexagon NPU on Android phones (https://github.com/ggml-org/llama.cpp/pull/12326). Not maintained since Jul 15, 2025.
☆38Jul 14, 2025Updated 7 months ago
Alternatives and similar repositories for ggml-hexagon
Users who are interested in ggml-hexagon are comparing it to the libraries listed below.
- Inference deployment of the llama3☆11Apr 21, 2024Updated last year
- LLM inference in C/C++☆48Feb 27, 2026Updated last week
- ☆10Jul 18, 2024Updated last year
- ffmpeg+cuvid+tensorrt+multicamera☆12Dec 31, 2024Updated last year
- Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK☆90Feb 14, 2026Updated 3 weeks ago
- workbench for learning and practicing on-device AI technology in real scenario with online-TV on Android phone, powered by ggml(llama.cpp…☆187Jun 12, 2025Updated 8 months ago
- ggml study notes; ggml is an inference framework for machine learning☆18Mar 24, 2024Updated last year
- Deployment of the RKNN inference framework on Rockchip chips (YOLO models)☆13Jul 17, 2025Updated 7 months ago
- ☆19Dec 29, 2023Updated 2 years ago
- RKNN-YOLOV5-BatchInference-MultiThreading: multi-threaded C++ inference of YOLOV5 on multiple images☆22Nov 6, 2023Updated 2 years ago
- Sophgo AI chips driver and runtime library.☆24Feb 5, 2026Updated last month
- A one-page-only CGraph-API-like DAG project.☆26Feb 11, 2025Updated last year
- ☆27Mar 17, 2025Updated 11 months ago
- Template for RKNN model inference deployment☆24Aug 11, 2023Updated 2 years ago
- ☆28Jun 30, 2025Updated 8 months ago
- A website for accessing many models through an API (DeepSeek, Qwen, Hunyuan, etc.)☆17Jul 12, 2025Updated 7 months ago
- Example of SenseCraft Model Assistant Model deployment related to ESP32☆32Apr 9, 2025Updated 11 months ago
- LLM inference in C/C++☆26Jan 27, 2026Updated last month
- NanoTrack(@HonglinChu), C++ TensorRT deployment. MAX 250 FPS!☆28Nov 6, 2023Updated 2 years ago
- ☆39Feb 12, 2026Updated 3 weeks ago
- C++ implementations for various tokenizers (sentencepiece, tiktoken etc).☆49Feb 23, 2026Updated last week
- Port of Funasr's Paraformer model in C/C++☆40Jun 19, 2024Updated last year
- ☆34Sep 8, 2024Updated last year
- stable diffusion using mnn☆67Sep 28, 2023Updated 2 years ago
- This project is intended to build and deploy an SNPE model on Qualcomm Devices, which are having unsupported layers which are not part of…☆10Oct 4, 2021Updated 4 years ago
- Parallel computing study notes☆44Feb 25, 2017Updated 9 years ago
- This Python script can help you detect which object is moving.☆12Nov 28, 2018Updated 7 years ago
- PyTorch for RISC-V Architecture on OpenEuler 24.03☆13Jun 27, 2024Updated last year
- A minimal ESP32-S3 dev kit that is compatible with usual breadboard, featuring dual USB Type-C ports and exposing all available GPIO pins…☆12Jul 22, 2023Updated 2 years ago
- Model compression for ONNX☆100Mar 1, 2026Updated last week
- Large Language Model Onnx Inference Framework☆35Nov 25, 2025Updated 3 months ago
- Deploys LDC, a lightweight dense convolutional neural network for edge detection, with ONNXRuntime; includes both C++ and Python versions of the program☆11Apr 24, 2023Updated 2 years ago
- PyTorch Quantization Aware Training (QAT)☆42Oct 13, 2023Updated 2 years ago
- Implementation of yolo v10 in c++ std 17 over opencv and onnxruntime☆90Sep 28, 2024Updated last year
- Try to export the ONNX QDQ model that conforms to the AXERA NPU quantization specification. Currently, only w8a8 is supported.☆10Sep 10, 2024Updated last year
- ☆10Sep 4, 2025Updated 6 months ago
- workflow of nndeploy☆13Nov 5, 2025Updated 4 months ago
- Sample that converts the MobileSAM encoder/decoder to ONNX and runs inference☆12Apr 11, 2024Updated last year
- trt-hackathon-2022 third-prize solution☆10Mar 6, 2023Updated 3 years ago