Zepan / llama.cpp
Port of Facebook's LLaMA model in C/C++
☆13Updated last year
Alternatives and similar repositories for llama.cpp:
Users who are interested in llama.cpp are comparing it to the libraries listed below.
- ☆116Updated 9 months ago
- ☆156Updated this week
- Onboarding documentation source for the AMD Ryzen™ AI Software Platform. The AMD Ryzen™ AI Software Platform enables developers to take…☆50Updated last week
- Development repository for the Triton language and compiler☆104Updated this week
- The Riallto Open Source Project from AMD☆71Updated 2 months ago
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit☆115Updated 9 months ago
- ☆18Updated 4 months ago
- GPTQ inference Triton kernel☆292Updated last year
- ☆82Updated last year
- ☆24Updated 10 months ago
- Fork of LLVM to support AMD AIEngine processors☆124Updated this week
- AMD related optimizations for transformer models☆64Updated 2 months ago
- Repository of model demos using TT-Buda☆60Updated last month
- Following the RISC-V IME extension standard, and reusing Vector register resources, these instructions can bring more than a tenfold perf…☆46Updated 5 months ago
- Experiments and prototypes associated with IREE or MLIR☆51Updated 5 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆88Updated this week
- HeteroCL-MLIR dialect for accelerator design☆41Updated 4 months ago
- A converter and basic tester for rwkv onnx☆42Updated last year
- ☆88Updated this week
- An optimized neural network operator library for chips based on Xuantie CPU.☆87Updated 7 months ago
- The translator that supports translating NVPTX to SPIR-V. This translator is modified from LLVM-SPIR-V Translator.☆36Updated 3 years ago
- ☆58Updated last year
- TVM for chips based on Xuantie CPU, an open deep learning compiler stack.☆30Updated 7 months ago
- AI applications and tools☆26Updated this week
- muRISCV-NN is a collection of efficient deep learning kernels for embedded platforms and microcontrollers.☆67Updated last week
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization☆101Updated 3 months ago
- ☆21Updated this week
- ☆34Updated this week
- My development fork of llama.cpp. For now working on RK3588 NPU and Tenstorrent backend☆80Updated last week
- LLaMa/RWKV onnx models, quantization and testcase☆356Updated last year