lx200916 / ChatBotApp
☆41Updated 7 months ago
Alternatives and similar repositories for ChatBotApp
Users who are interested in ChatBotApp are comparing it to the libraries listed below
- the original reference implementation of a specified llama.cpp backend for Qualcomm Hexagon NPU on Android phone, https://github.com/ggml…☆35Updated 3 months ago
- High-speed and easy-use LLM serving framework for local deployment☆130Updated 3 months ago
- Code for ACM MobiCom 2024 paper "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices"☆56Updated 9 months ago
- QAI AppBuilder is designed to help developers easily execute models on WoS and Linux platforms. It encapsulates the Qualcomm® AI Runtime …☆83Updated this week
- ☆90Updated 3 weeks ago
- Fast Multimodal LLM on Mobile Devices☆1,156Updated this week
- This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai☆93Updated this week
- [EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models☆68Updated last year
- Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK☆87Updated 2 weeks ago
- LLM inference in C/C++☆46Updated last week
- The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) a…☆327Updated last week
- This is a list of awesome edgeAI inference related papers.☆99Updated last year
- The open-source project for "Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading"[MobiCom'2022]☆19Updated 3 years ago
- Awesome Mobile LLMs☆267Updated 3 weeks ago
- A layered, decoupled deep learning inference engine☆76Updated 8 months ago
- zTT: Learning-based DVFS with Zero Thermal Throttling for Mobile Devices [MobiSys'21] - Artifact Evaluation☆26Updated 4 years ago
- ☆78Updated 2 years ago
- mperf is an operator performance tuning toolbox for mobile/embedded platforms☆191Updated 2 years ago
- llm theoretical performance analysis tools and support params, flops, memory and latency analysis.☆111Updated 3 months ago
- hands on model tuning with TVM and profile it on a Mac M1, x86 CPU, and GTX-1080 GPU.☆50Updated 2 years ago
- ☆168Updated last week
- Penn CIS 5650 (GPU Programming and Architecture) Final Project☆45Updated last year
- LLM inference in C/C++☆20Updated 2 weeks ago
- Demonstration of running a native LLM on Android device.☆194Updated this week
- High performance Transformer implementation in C++.☆140Updated 9 months ago
- Machine Learning Compilation (Tianqi Chen)☆48Updated 2 years ago
- Assembler and Decompiler for NVIDIA (Maxwell Pascal Volta Turing Ampere) GPUs.☆90Updated 2 years ago
- llm-export can export llm model to onnx.☆320Updated 2 weeks ago
- Triton adapter for Ascend. Mirror of https://gitee.com/ascend/triton-ascend☆81Updated this week
- A llama model inference framework implemented in CUDA C++☆62Updated last year