hjc4869 / llama.cpp
LLM inference in C/C++
☆13 · Updated 2 weeks ago
Alternatives and similar repositories for llama.cpp:
Users interested in llama.cpp are comparing it to the libraries listed below.
- automatically quant GGUF models ☆164 · Updated last week
- Run multiple resource-heavy Large Models (LM) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ☆55 · Updated last month
- LLM inference in C/C++ ☆67 · Updated this week
- 8-bit CUDA functions for PyTorch, ROCm compatible ☆39 · Updated last year
- Core, Junction, and VRAM temperature reader for Linux + GDDR6/GDDR6X GPUs ☆37 · Updated 3 months ago
- Easily view and modify JSON datasets for large language models ☆71 · Updated 3 weeks ago
- Dagger functions to import Hugging Face GGUF models into a local ollama instance and optionally push them to ollama.com. ☆115 · Updated 10 months ago
- ☆52 · Updated this week
- A pipeline-parallel training script for LLMs. ☆136 · Updated last week
- ☆40 · Updated this week
- 8-bit CUDA functions for PyTorch ☆45 · Updated last month
- ☆81 · Updated 2 weeks ago
- LM inference server implementation based on *.cpp. ☆154 · Updated this week
- Croco.Cpp is a third-party testground for KoboldCPP, a simple one-file way to run various GGML/GGUF models with KoboldAI's UI. (for Croco.C… ☆100 · Updated this week
- GPU Power and Performance Manager ☆57 · Updated 5 months ago
- Super simple Python connectors for llama.cpp, including vision models (Gemma 3, Qwen2-VL). Compile llama.cpp and run! ☆22 · Updated this week
- Tool to download models from the Hugging Face Hub and convert them to GGML/GGUF for llama.cpp ☆127 · Updated 6 months ago
- Stable Diffusion and Flux in pure C/C++ ☆13 · Updated this week
- AMD (Radeon GPU) ROCm-based setup for popular AI tools on Ubuntu 24.04.1 ☆200 · Updated last month
- ☆83 · Updated 3 months ago
- Lightweight inference server for OpenVINO ☆143 · Updated this week
- Fast and memory-efficient exact attention ☆163 · Updated this week
- llama.cpp to PyTorch converter ☆33 · Updated 11 months ago
- DeepSpeed Windows information ☆37 · Updated last year
- The heart of the Pulsar App: fast, secure, shared inference with a modern UI ☆56 · Updated 3 months ago
- idea: https://github.com/nyxkrage/ebook-groupchat/ ☆86 · Updated 7 months ago
- An interface with barely any external dependencies beyond the Ollama API itself, making it lightweight and portable to easily i… ☆12 · Updated this week
- llama.cpp fork with additional SOTA quants and improved performance ☆222 · Updated this week
- ☆21 · Updated 5 months ago
- My personal fork of koboldcpp where I hack in experimental samplers. ☆44 · Updated 10 months ago