lx200916 / ChatBotApp
☆36 · Updated 4 months ago
Alternatives and similar repositories for ChatBotApp
Users interested in ChatBotApp are comparing it to the repositories listed below.
- The original reference implementation of a llama.cpp backend for the Qualcomm Hexagon NPU on Android phones, https://github.com/ggml… ☆27 · Updated 3 weeks ago
- High-speed, easy-to-use LLM serving framework for local deployment ☆115 · Updated 4 months ago
- Fast Multimodal LLM on Mobile Devices ☆983 · Updated this week
- QAI AppBuilder is designed to help developers easily execute models on WoS and Linux platforms. It encapsulates the Qualcomm® AI Runtime … ☆59 · Updated this week
- Code for ACM MobiCom 2024 paper "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices" ☆54 · Updated 6 months ago
- The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) a… ☆252 · Updated last week
- Inference RWKV v5, v6 and v7 with the Qualcomm AI Engine Direct SDK ☆77 · Updated this week
- mperf is an operator performance-tuning toolbox for mobile/embedded platforms ☆188 · Updated last year
- This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai ☆62 · Updated 2 weeks ago
- [EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models ☆66 · Updated 10 months ago
- LLM inference in C/C++ ☆44 · Updated this week
- Awesome Mobile LLMs ☆226 · Updated last week
- LLM inference in C/C++ ☆19 · Updated last week
- Triton adapter for Ascend. Mirror of https://gitee.com/ascend/triton-ascend ☆61 · Updated last week
- The open-source project for "Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading" [MobiCom '22] ☆19 · Updated 3 years ago
- ☆70 · Updated last month
- ☆161 · Updated 2 weeks ago
- Summary of some awesome work for optimizing LLM inference ☆92 · Updated 2 months ago
- Penn CIS 5650 (GPU Programming and Architecture) Final Project ☆38 · Updated last year
- A layered, decoupled deep learning inference engine ☆75 · Updated 5 months ago
- Machine Learning Compilation, by Tianqi Chen ☆38 · Updated 2 years ago
- Efficient inference of large language models. ☆150 · Updated last month
- Assembler and decompiler for NVIDIA (Maxwell, Pascal, Volta, Turing, Ampere) GPUs ☆82 · Updated 2 years ago
- Demonstration of running a native LLM on an Android device ☆161 · Updated this week
- Low-bit LLM inference on CPU/NPU with lookup table ☆836 · Updated 2 months ago
- A list of awesome edge-AI inference related papers ☆97 · Updated last year
- ☆151 · Updated last month
- Hands-on model tuning with TVM, profiled on a Mac M1, an x86 CPU, and a GTX-1080 GPU ☆49 · Updated 2 years ago
- ☆32 · Updated 11 months ago
- A demo of how to write a high-performance convolution that runs on Apple Silicon ☆54 · Updated 3 years ago