Zepan / llama.cpp
Port of Facebook's LLaMA model in C/C++
☆13 · Updated 2 years ago
Alternatives and similar repositories for llama.cpp
Users interested in llama.cpp are comparing it to the repositories listed below.
- ☆172 · Updated this week
- ☆120 · Updated last year
- My development fork of llama.cpp. Currently working on the RK3588 NPU and Tenstorrent backends ☆114 · Updated 2 weeks ago
- ☆577 · Updated last year
- Repository of model demos using TT-Buda ☆63 · Updated 10 months ago
- BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment. ☆749 · Updated 6 months ago
- Efficient Inference of Transformer models ☆478 · Updated last year
- Tiny ASIC implementation of the matrix multiplication unit from "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" ☆175 · Updated last year
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆93 · Updated last week
- OpenAI Triton backend for Intel® GPUs ☆226 · Updated this week
- GPTQ inference Triton kernel ☆321 · Updated 2 years ago
- Development repository for the Triton language and compiler ☆140 · Updated last week
- LLaMA/RWKV ONNX models, quantization and test cases ☆367 · Updated 2 years ago
- RWKV v7 inference in pure C. ☆44 · Updated 3 months ago
- FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens. ☆1,005 · Updated last year
- [ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs ☆229 · Updated last year
- Tenstorrent TT-BUDA Repository ☆314 · Updated 10 months ago
- Reverse engineering the RK3588 NPU ☆110 · Updated last year
- Prepare for DeepSeek R1 inference: benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆74 · Updated last year
- VPTQ, a flexible and extreme low-bit quantization algorithm ☆674 · Updated 9 months ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models" ☆280 · Updated 2 years ago
- Code for the paper "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" ☆395 · Updated last year
- High-speed GEMV kernels, up to 2.7x speedup over the PyTorch baseline. ☆127 · Updated last year
- An innovative library for efficient LLM inference via low-bit quantization ☆352 · Updated last year
- SoTA Transformers with a C backend for fast inference on your CPU. ☆311 · Updated 2 years ago
- LLM training in simple, raw C/HIP for AMD GPUs ☆58 · Updated last year
- Fork of LLVM to support AMD AIEngine processors ☆187 · Updated this week
- [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization ☆713 · Updated last year
- An experimental CPU backend for Triton ☆174 · Updated 2 months ago
- Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5). ☆276 · Updated 6 months ago