MollySophia / rwkv-qualcomm

Run RWKV v5, v6, and v7 inference with the Qualcomm AI Engine Direct SDK

☆88

Alternatives and similar repositories for rwkv-qualcomm

Users who are interested in rwkv-qualcomm compare it to the repositories listed below

daquexian / faster-rwkv
☆125Updated last year
MollySophia / rwkv-mobile
Inference RWKV with multiple supported backends.
☆70Updated last week
wangzhaode / onnx-llm
LLM deployment project based on ONNX.
☆47Updated last year
EdVince / llm-cpp
☆33Updated last year
lrw04 / llama2.c-to-ncnn
A converter for llama2.c legacy models to ncnn models.
☆80Updated last year
wangzhaode / mnn-stable-diffusion
Stable Diffusion using MNN
☆67Updated 2 years ago
tsingmicro-toolchain / OnnxSlim
A Toolkit to Help Optimize Large Onnx Model
☆162Updated last month
jeffzhou2000 / ggml-hexagon
the original reference implementation of a specified llama.cpp backend for Qualcomm Hexagon NPU on Android phone, https://github.com/ggml…
☆35Updated 4 months ago
inisis / OnnxSlim
A Toolkit to Help Optimize Onnx Model
☆256Updated this week
EdVince / diffusers-ncnn
☆84Updated 2 years ago
gesanqiu / Chinese_MobileBert_on_SNPE
Run Chinese MobileBert model on SNPE.
☆15Updated 2 years ago
yvonwin / qwen2.cpp
C++ implementation of Qwen2 and Llama 3
☆48Updated last year
lovemefan / ggml-learning-notes
ggml learning notes; ggml is an inference framework for machine learning
☆18Updated last year
DataXujing / Qwen1.5-0.5b-chat-android
Deploying a large language model on Android phones with MNN-llm: Qwen1.5-0.5B-Chat
☆86Updated last year
saic-fi / MobileQuant
[EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models
☆68Updated last year
inisis / OnnxLLM
Large Language Model Onnx Inference Framework
☆36Updated last week
quic / ai-engine-direct-helper
QAI AppBuilder is designed to help developers easily execute models on WoS and Linux platforms. It encapsulates the Qualcomm® AI Runtime …
☆92Updated this week
chraac / llama.cpp
LLM inference in C/C++
☆47Updated last week
wangkuiyi / huggingface-tokenizer-in-cxx
☆70Updated 2 years ago
marty1885 / llama.cpp
My development fork of llama.cpp. For now working on RK3588 NPU and Tenstorrent backend
☆110Updated 3 weeks ago
wangzhaode / llm-export
llm-export can export LLM models to ONNX.
☆333Updated last month
luchangli03 / onnxsim_large_model
Simplify large (>2 GB) ONNX models
☆69Updated last year
AXERA-TECH / ax-llm
Explore LLM model deployment based on AXera's AI chips
☆130Updated last week
hisrg / SNPE
Snapdragon Neural Processing Engine (SNPE) SDK. The Snapdragon Neural Processing Engine (SNPE) is a Qualcomm Snapdragon software accelerate…
☆35Updated 3 years ago
FeiGeChuanShu / trt2023
NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM
☆43Updated 2 years ago
lrw04 / tinyllamas-ncnn
Inference TinyLlama models on ncnn
☆24Updated 2 years ago
MollySophia / rwkv-ncnn
Run RWKV inference on NCNN
☆49Updated last year
staghado / vit.cpp
Inference Vision Transformer (ViT) in plain C/C++ with ggml
☆300Updated last year
xxxxyu / FlexNN
Code for ACM MobiCom 2024 paper "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices"
☆56Updated 10 months ago
haozixu / llama.cpp-npu
☆34Updated last month