ubergarm / ik_llama.cpp
llama.cpp fork with additional SOTA quants and improved performance
☆21 · Updated this week
Alternatives and similar repositories for ik_llama.cpp
Users interested in ik_llama.cpp are comparing it to the libraries listed below.
- ☆90 · Updated last month
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆626 · Updated last week
- Produce your own Dynamic 3.0 Quants and achieve optimum accuracy & SOTA quantization performance! Input your VRAM and RAM and the toolcha… ☆76 · Updated this week
- ☆27 · Updated 7 months ago
- Autonomous, agentic, creative story writing system that incorporates stored embeddings and Knowledge Graphs. ☆92 · Updated this week
- Inference engine for Intel devices. Serve LLMs, VLMs, Whisper, Kokoro-TTS, Embedding and Rerank models over OpenAI endpoints. ☆295 · Updated this week
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning ☆30 · Updated 8 months ago
- The most feature-complete local AI workstation. Multi-GPU inference, integrated Stable Diffusion + ADetailer, voice cloning, research-gra… ☆55 · Updated this week
- ☆230 · Updated 9 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆88 · Updated last week
- A persistent local memory for AI, LLMs, or Copilot in VS Code. ☆191 · Updated 3 months ago
- ☆205 · Updated 5 months ago
- Local LLM Powered Recursive Search & Smart Knowledge Explorer ☆260 · Updated 3 months ago
- Croco.Cpp is a fork of KoboldCPP inferring GGML/GGUF models on CPU/CUDA with KoboldAI's UI. It's powered partly by IK_LLama.cpp, and compati… ☆156 · Updated this week
- Orpheus Chat WebUI ☆76 · Updated 10 months ago
- ☆178 · Updated 5 months ago
- The easiest & fastest way to run LLMs in your home lab ☆80 · Updated 2 months ago
- Convert downloaded Ollama models back into their GGUF equivalent format ☆71 · Updated last year
- Fully local, temporally aware natural language file search on your PC, even without a GPU. Find relevant files using natural language i… ☆166 · Updated last month
- ☆51 · Updated 11 months ago
- A local AI companion that uses a collection of free, open source AI models in order to create two virtual companions that will follow you… ☆240 · Updated 3 months ago
- ☆51 · Updated 3 months ago
- ☆109 · Updated 5 months ago
- Automatically quant GGUF models ☆219 · Updated last month
- A Conversational Speech Generation Model with Gradio UI and OpenAI compatible API. UI and API support CUDA, MLX and CPU devices. ☆211 · Updated 9 months ago
- Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies. ☆157 · Updated 7 months ago
- A fork of vLLM enabling Pascal architecture GPUs ☆32 · Updated 11 months ago
- Open source LLM UI, compatible with all local LLM providers. ☆177 · Updated last year
- InferX: Inference as a Service Platform ☆156 · Updated this week
- Docs for GGUF quantization (unofficial) ☆366 · Updated 6 months ago