CodeLinaro / llama.cpp
LLM inference in C/C++
☆20Updated 3 months ago
Alternatives and similar repositories for llama.cpp
Users who are interested in llama.cpp are comparing it to the libraries listed below.
- LLM inference in C/C++☆48Updated this week
- mperf is an operator performance tuning toolbox for mobile/embedded platforms☆193Updated 2 years ago
- Detect CPU features with a single file☆442Updated last month
- the original reference implementation of a specified llama.cpp backend for Qualcomm Hexagon NPU on Android phone, https://github.com/ggml…☆35Updated 6 months ago
- A tool which profiles Vulkan devices to find their peak capacities☆159Updated 3 weeks ago
- Notes on the Qualcomm OpenCL SDK☆37Updated 7 years ago
- QAI AppBuilder is designed to help developers easily execute models on WoS and Linux platforms. It encapsulates the Qualcomm® AI Runtime …☆111Updated this week
- A repo for llm on ncnn☆189Updated last month
- A demo of how to write a high-performance convolution that runs on Apple silicon☆57Updated 3 years ago
- a simple Flash Attention v2 implementation with ROCM (RDNA3 GPU, roc wmma), mainly used for stable diffusion(ComfyUI) in Windows ZLUDA en…☆51Updated last year
- Self-implemented NN operators for Qualcomm's Hexagon NPU☆46Updated 4 months ago
- Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK☆90Updated this week
- OpenAI Triton backend for Intel® GPUs☆226Updated this week
- ☆172Updated this week
- Assembler and Decompiler for NVIDIA (Maxwell Pascal Volta Turing Ampere) GPUs.☆95Updated 2 years ago
- ☆63Updated 4 years ago
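- This project generates images from text. Based on the open-source Stable Diffusion V1.5 model, it produces models that can run on a phone's CPU and NPU, along with the accompanying model inference framework.☆231Updated last year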
- ☆125Updated 2 years ago
- ☆42Updated 10 months ago
- My study notes for mlsys☆15Updated last year
- ☆56Updated last month
- Efficient operation implementation based on the Cambricon Machine Learning Unit (MLU).☆150Updated 2 weeks ago
- ☆118Updated 10 months ago
- We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel …☆192Updated last year
- ☆85Updated 2 years ago
- FlagTree is a unified compiler supporting multiple AI chip backends for custom Deep Learning operations, which is forked from triton-lang…☆200Updated this week
- Qualcomm Hexagon NN Offload Framework☆45Updated 5 years ago
- A profiler to disclose and quantify hardware features on GPUs.☆175Updated 3 years ago
- An unofficial cuda assembler, for all generations of SASS, hopefully :)☆84Updated 2 years ago
- A converter for llama2.c legacy models to ncnn models.☆79Updated 2 years ago