zhouwg / ggml-hexagon
An attempt to build a fully open-source ggml-hexagon backend for llama.cpp on Android phones equipped with Qualcomm's Hexagon NPU. Details are available at https://github.com/zhouwg/ggml-hexagon/discussions/18
☆16Updated this week
Alternatives and similar repositories for ggml-hexagon:
Users that are interested in ggml-hexagon are comparing it to the libraries listed below
- LLM inference in C/C++☆36Updated this week
- Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK☆62Updated last week
- LLM deployment project based on ONNX.☆36Updated 6 months ago
- ☆32Updated 9 months ago
- ☆34Updated 2 weeks ago
- ☆124Updated last year
- mperf is an operator performance tuning toolbox for mobile/embedded platforms☆182Updated last year
- ☆30Updated 7 months ago
- Large Language Model Onnx Inference Framework☆32Updated 3 months ago
- Run Chinese MobileBert model on SNPE.☆14Updated last year
- llama 2 Inference☆42Updated last year
- Sophgo AI chips driver and runtime library.☆19Updated last week
- A converter for llama2.c legacy models to ncnn models.☆87Updated last year
- stable diffusion using mnn☆68Updated last year
- ☆10Updated 9 months ago
- A secondary wrapper around the rknn2 API that makes it easier to call. Applicable to rk356x and rk3588☆45Updated 2 years ago
- ☆40Updated 2 years ago
- Run Large Language Models on RK3588 with GPU-acceleration☆98Updated last year
- This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai☆33Updated this week
- snpe tutorial☆10Updated last year
- ☆156Updated 3 weeks ago
- Detect CPU features with single-file☆387Updated this week
- ggml study notes; ggml is a machine learning inference framework☆15Updated last year
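- CLIP inference implemented in C++. The model has a few minor changes; the changes and the model export code can be found in the README, and the model files are all in Releases, including the AX650 model. Adds support for ChineseCLIP☆30Updated 4 months ago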
- My development fork of llama.cpp. For now working on RK3588 NPU and Tenstorrent backend☆90Updated last week
- Standalone Flash Attention v2 kernel without libtorch dependency☆108Updated 7 months ago
- Run generative AI models on Sophgo BM1684X☆199Updated this week
- ☆11Updated last month
- Inference deployment of llama3☆11Updated last year
- ☆84Updated 2 years ago