Thireus / GGUF-Tool-Suite
Input your VRAM and RAM capacities and the toolchain produces a GGUF model tuned to your system within seconds: flexible model sizing and the lowest achievable perplexity, for advanced users who want precise, automated production of dynamic GGUF quants.
☆43 · Updated this week
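The description above boils down to a constrained optimization: spend a fixed VRAM/RAM byte budget where extra precision buys the largest perplexity reduction. Below is a minimal, hypothetical Python sketch of that idea using greedy payoff-per-byte upgrades; the quant names, bit widths, penalty figures, and sensitivity values are invented for illustration and this is not the suite's actual algorithm.

```python
# Hypothetical sketch, NOT GGUF-Tool-Suite's actual code: pick a quant
# type per tensor under a byte budget by greedily buying the upgrade
# with the best perplexity payoff per extra byte.
from dataclasses import dataclass

@dataclass
class Tensor:
    name: str
    n_params: int       # number of weights in this tensor
    sensitivity: float  # invented: how much this tensor's error hurts PPL

# quant type -> (bits per weight, assumed per-weight perplexity penalty);
# all numbers here are illustrative, not measured.
QUANTS = {
    "Q2_K": (2.6, 0.60),
    "Q4_K": (4.5, 0.12),
    "Q5_K": (5.5, 0.05),
    "Q8_0": (8.5, 0.01),
}
ORDER = sorted(QUANTS, key=lambda q: QUANTS[q][0])  # cheapest first

def assign_quants(tensors: list[Tensor], budget_bytes: float) -> dict[str, str]:
    """Start every tensor at the cheapest type, then spend the remaining
    budget on whichever single-step upgrade pays off most per byte."""
    level = {t.name: 0 for t in tensors}  # index into ORDER per tensor
    used = sum(t.n_params * QUANTS[ORDER[0]][0] / 8 for t in tensors)
    while True:
        best = None  # (payoff per byte, tensor, extra bytes)
        for t in tensors:
            i = level[t.name]
            if i + 1 == len(ORDER):
                continue  # already at the largest type
            cur_bits, cur_ppl = QUANTS[ORDER[i]]
            nxt_bits, nxt_ppl = QUANTS[ORDER[i + 1]]
            extra = t.n_params * (nxt_bits - cur_bits) / 8
            payoff = t.sensitivity * t.n_params * (cur_ppl - nxt_ppl) / extra
            if used + extra <= budget_bytes and (best is None or payoff > best[0]):
                best = (payoff, t, extra)
        if best is None:  # no affordable upgrade left
            return {name: ORDER[i] for name, i in level.items()}
        _, t, extra = best
        level[t.name] += 1
        used += extra

# Example: two tensors, a 100 MB budget.
model = [Tensor("blk.0.attn_q", 50_000_000, 2.0),
         Tensor("blk.0.ffn_up", 140_000_000, 1.0)]
print(assign_quants(model, budget_bytes=100e6))
```

A real pipeline would measure per-tensor sensitivity (for example with an importance matrix or perplexity sweeps) rather than hard-coding it, and would search the quant-mix space more carefully than a one-step greedy loop.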
Alternatives and similar repositories for GGUF-Tool-Suite
Users interested in GGUF-Tool-Suite are comparing it to the libraries listed below.
- Run multiple resource-heavy Large Models (LM) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ☆82 · Updated this week
- Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing. ☆18 · Updated 2 weeks ago
- Automatically quantize GGUF models. ☆200 · Updated this week
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a… ☆42 · Updated 11 months ago
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning. ☆27 · Updated 4 months ago
- ☆100 · Updated 3 weeks ago
- Lightweight inference server for OpenVINO. ☆211 · Updated this week
- Generate a llama-quantize command that copies the quantization parameters of any GGUF (see the usage example after this list). ☆24 · Updated last month
- Simple node proxy for llama-server that enables MCP use. ☆13 · Updated 4 months ago
- InferX is an Inference Function-as-a-Service platform. ☆133 · Updated this week
- Running Microsoft's BitNet via Electron, React & Astro. ☆44 · Updated 3 months ago
- ☆83 · Updated this week
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining. ☆42 · Updated last week
- Sparse inferencing for transformer-based LLMs. ☆197 · Updated last month
- Easily view and modify JSON datasets for large language models. ☆82 · Updated 4 months ago
- ☆23 · Updated 10 months ago
- llama.cpp fork with additional SOTA quants and improved performance. ☆28 · Updated last week
- Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies. ☆110 · Updated 2 months ago
- ☆122 · Updated 10 months ago
- ☆209 · Updated last week
- Croco.Cpp is a fork of KoboldCPP inferring GGML/GGUF models on CPU/CUDA with KoboldAI's UI. It's powered partly by IK_LLama.cpp, and compati… ☆136 · Updated this week
- Lightweight & fast AI inference proxy for self-hosted LLM backends like Ollama, LM Studio and others. Designed for speed, simplicity and… ☆87 · Updated last week
- ☆50 · Updated 7 months ago
- A pipeline-parallel training script for LLMs. ☆159 · Updated 4 months ago
- LLM Ripper is a framework for component extraction (embeddings, attention heads, FFNs), activation capture, functional analysis, and adap… ☆46 · Updated last week
- ☆62 · Updated 2 months ago
- Super simple Python connectors for llama.cpp, including vision models (Gemma 3, Qwen2-VL). Compile llama.cpp and run! ☆28 · Updated last month
- ☆133 · Updated 4 months ago
- ☆24 · Updated 7 months ago
- ☆20 · Updated 11 months ago
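For the llama-quantize entry above, a hedged usage example: llama-quantize is llama.cpp's quantization CLI, and the flags shown (--allow-requantize, --token-embedding-type, --output-tensor-type) are real llama.cpp options, but the specific type values below are placeholders standing in for parameters that such a tool would read out of a source GGUF.

```sh
# Requantize to Q4_K_M while pinning the token-embedding and output
# tensors to q8_0 (placeholder values copied from a reference GGUF).
llama-quantize --allow-requantize \
    --token-embedding-type q8_0 \
    --output-tensor-type q8_0 \
    input-f16.gguf output-Q4_K_M.gguf Q4_K_M
```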