slaren / llama.cpp
Port of Facebook's LLaMA model in C/C++
☆10 · Updated last week
Alternatives and similar repositories for llama.cpp:
Users who are interested in llama.cpp are comparing it to the libraries listed below.
- Experiments with BitNet inference on CPU ☆54 · Updated last year
- ☆124 · Updated 10 months ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆11 · Updated last year
- TTS support with GGML ☆32 · Updated this week
- Train your own small bitnet model ☆70 · Updated 6 months ago
- Inference Llama/Llama2/Llama3 Models in NumPy ☆20 · Updated last year
- Inference of Mamba models in pure C ☆188 · Updated last year
- ☆72 · Updated 5 months ago
- ☆20 · Updated last month
- Inference RWKV with multiple supported backends. ☆43 · Updated this week
- ☆11 · Updated last year
- A faithful clone of Karpathy's llama2.c (one file inference, zero dependency) but fully functional with LLaMA 3 8B base and instruct mode… ☆126 · Updated 9 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆56 · Updated last year
- Mamba R1 represents a novel architecture that combines the efficiency of Mamba's state space models with the scalability of Mixture of Ex… ☆20 · Updated 2 weeks ago
- RWKV centralised docs for the community ☆24 · Updated last month
- GGML implementation of BERT model with Python bindings and quantization. ☆26 · Updated last year
- ☆30 · Updated 8 months ago
- Customizable machine translation in C++ ☆51 · Updated last year
- A fast RWKV Tokenizer written in Rust ☆45 · Updated last month
- Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK ☆64 · Updated this week
- Eh, simple and works. ☆27 · Updated last year
- Profile your CoreML models directly from Python 🐍 ☆27 · Updated 6 months ago
- tinygrad port of the RWKV large language model. ☆44 · Updated 2 months ago
- Trying to deconstruct RWKV in understandable terms ☆14 · Updated 2 years ago
- QuIP quantization ☆52 · Updated last year
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆137 · Updated this week
- ☆54 · Updated 8 months ago
- CPU inference for the DeepSeek family of large language models in C++ ☆291 · Updated this week
- My development fork of llama.cpp. For now working on RK3588 NPU and Tenstorrent backend ☆91 · Updated 2 weeks ago
- ☆124 · Updated last year