Thireus / GGUF-Tool-Suite
Input your VRAM and RAM and the toolchain will produce a GGUF model tuned to your system within seconds — flexible model sizing and lowest achievable perplexity for advanced users seeking precise and automated GGUF dynamic quant production.
☆65 Updated last week
Alternatives and similar repositories for GGUF-Tool-Suite
Users who are interested in GGUF-Tool-Suite are comparing it to the repositories listed below.
- llama.cpp fork with additional SOTA quants and improved performance ☆37 Updated last week
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆84 Updated this week
- Croco.Cpp is a fork of KoboldCPP inferring GGML/GGUF models on CPU/CUDA with KoboldAI's UI. It's powered partly by IK_LLama.cpp, and compati… ☆154 Updated this week
- Automatically quantize GGUF models ☆217 Updated last month
- ☆86 Updated last week
- ☆125 Updated last year
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning ☆29 Updated 6 months ago
- ☆107 Updated 3 months ago
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining. ☆46 Updated last month
- A local front-end for open-weight LLMs with memory, RAG, TTS/STT, Elo ratings, and dynamic research tools. Built with React and FastAPI. ☆39 Updated 3 months ago
- llama.cpp fork with additional SOTA quants and improved performance ☆21 Updated last week
- Privacy-first agentic framework with powerful reasoning & task automation capabilities. Natively distributed and fully ISO 27XXX complian… ☆68 Updated 7 months ago
- ☆49 Updated last month
- ☆51 Updated 9 months ago
- Generate a llama-quantize command to copy the quantization parameters of any GGUF ☆27 Updated 3 months ago
- InferX: Inference as a Service Platform ☆139 Updated this week
- ☆135 Updated 6 months ago
- Make abliterated models with transformers, easy and fast ☆96 Updated this week
- Easily view and modify JSON datasets for large language models ☆84 Updated 6 months ago
- Simple node proxy for llama-server that enables MCP use ☆15 Updated 6 months ago
- My personal fork of koboldcpp where I hack in experimental samplers. ☆44 Updated last year
- SoTA open-source TTS ☆123 Updated last month
- Super simple python connectors for llama.cpp, including vision models (Gemma 3, Qwen2-VL). Compile llama.cpp and run! ☆29 Updated this week
- Automated speech dataset creator ☆209 Updated 5 months ago
- Orpheus Chat WebUI ☆75 Updated 8 months ago
- Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies. ☆146 Updated 4 months ago
- SLOP Detector and analyzer based on dictionary for shareGPT JSON and text ☆79 Updated last week
- NVIDIA Linux open GPU with P2P support ☆83 Updated 3 weeks ago
- Sparse Inferencing for transformer based LLMs ☆213 Updated 3 months ago
- win32 native frontend for llama-cli ☆12 Updated last year