Efficient non-uniform quantization with GPTQ for GGUF
☆63 · Sep 17, 2025 · Updated 7 months ago
Alternatives and similar repositories for gptq-gguf-toolkit
Users who are interested in gptq-gguf-toolkit are comparing it to the libraries listed below.
- Llama.cpp-qt is a Python-based GUI wrapper for the Llama.cpp server, providing a user-friendly interface for configuring and running the … ☆16 · Oct 4, 2023 · Updated 2 years ago
- A chat UI for Llama.cpp ☆16 · Apr 20, 2026 · Updated 3 weeks ago
- CUDA kernels that leverage LLM sparsity to improve throughput and reduce memory requirements during inference and training. ☆156 · Apr 22, 2026 · Updated 2 weeks ago
- Gradient descent optimizers for Julia ☆12 · May 26, 2020 · Updated 5 years ago
- ☆24 · Jul 14, 2025 · Updated 9 months ago
- ☆21 · Apr 3, 2025 · Updated last year
- ☆14 · Feb 7, 2024 · Updated 2 years ago
- [ICCV 2025] QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation ☆17 · Sep 26, 2025 · Updated 7 months ago
- A vim-like terminal reader to chat with your books