MollySophia / rwkv-qualcomm
Run inference for RWKV v5, v6, and v7 with the Qualcomm AI Engine Direct SDK
☆59 · Updated this week
Alternatives and similar repositories for rwkv-qualcomm:
Users who are interested in rwkv-qualcomm are comparing it to the libraries listed below.
- ☆124 · Updated last year
- RWKV inference with multiple supported backends. ☆35 · Updated this week
- ☆32 · Updated 8 months ago
- RWKV inference on NCNN ☆48 · Updated 6 months ago
- A converter for llama2.c legacy models to ncnn models. ☆87 · Updated last year
- A converter and basic tester for RWKV ONNX ☆42 · Updated last year
- LLM deployment project based on ONNX. ☆31 · Updated 5 months ago
- ☆84 · Updated 2 years ago
- Stable Diffusion using MNN ☆65 · Updated last year
- Qwen2 and Llama 3 C++ implementation ☆43 · Updated 9 months ago
- ☆18 · Updated 2 months ago
- TinyLlama model inference on ncnn ☆24 · Updated last year
- libvits-ncnn is an ncnn implementation of the VITS library that enables cross-platform GPU-accelerated speech synthesis. 🎙️💻 ☆60 · Updated last year
- A toolkit to help optimize ONNX models ☆125 · Updated this week
- Deploying a large language model on Android phones with MNN-llm: Qwen1.5-0.5B-Chat ☆70 · Updated 11 months ago
- UIE (Universal Information Extraction) inference with ncnn ☆12 · Updated 6 months ago
- This repo is an exploratory experiment to enable frozen pretrained RWKV language models to accept speech modality input. We followed the … ☆47 · Updated 3 months ago
- MNN ASR demo. ☆13 · Updated this week
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM ☆41 · Updated last year
- A toolkit to help optimize large ONNX models ☆153 · Updated 10 months ago
- An easy way to run, test, benchmark and tune OpenCL kernel files ☆23 · Updated last year
- ncnn HiFi-GAN ☆26 · Updated 5 months ago
- This is an inference framework for the RWKV large language model implemented purely in native PyTorch. The official native implementation… ☆126 · Updated 8 months ago
- A quantization algorithm for LLMs ☆136 · Updated 9 months ago
- ☢️ TensorRT 2023 second round: accelerating and optimizing Llama model inference with TensorRT-LLM ☆46 · Updated last year
- ☆30 · Updated 6 months ago
- Code for ACM MobiCom 2024 paper "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices" ☆50 · Updated 2 months ago
- ☆38 · Updated last week
- Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton ☆20 · Updated last week
- NeRF in NCNN with C++ and Vulkan ☆67 · Updated last year