electroglyph/quant_clone

electroglyph / quant_clone

Generate a llama-quantize command to copy the quantization parameters of any GGUF

☆34

Alternatives and similar repositories for quant_clone

Users who are interested in quant_clone are comparing it to the libraries listed below.

gruai / koifish
Sparse & quantized LLM training/inference/CPT/SFT/DPO
☆29 · Jul 7, 2026 · Updated last week
gigit0000 / qwen3.c
Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing.
☆25 · Sep 1, 2025 · Updated 10 months ago
heiervang-technologies / ht-vllm-omni
A framework for efficient model inference with omni-modality models
☆29 · Jul 12, 2026 · Updated last week
hasaranga / NativeChat
win32 native frontend for llama-cli
☆14 · Nov 2, 2024 · Updated last year
Thrasher-Software / sigil
A local-first LLM development studio. Build, test, and customize inference workflows with your own models — no cloud, totally local.
☆17 · May 21, 2025 · Updated last year
thad0ctor / KrunchWrapper
☆18 · Jul 1, 2025 · Updated last year
gigit0000 / qwen3.cu
Single-file, pure CUDA C implementation for running inference on Qwen3 0.6B GGUF. No Dependencies.
☆24 · Nov 26, 2025 · Updated 7 months ago
zeropointnine / tts-toy
Chatbot-to-speech using Orpheus TTS model. Interactive console app.
☆21 · May 1, 2025 · Updated last year
Green0-0 / llm_datasets
A collection of high quality huggingface datasets.
☆29 · Apr 19, 2026 · Updated 3 months ago
juzi5201314 / RepoExplainer
An AI tool designed to generate explanations for every file in a project
☆15 · Mar 7, 2025 · Updated last year
Magnetron85 / PyChat
Python-language chat with local Ollama models, Anthropic, and OpenAI
☆24 · Mar 5, 2026 · Updated 4 months ago
remichu-ai / gallamaUI
☆23 · May 14, 2026 · Updated 2 months ago
RhinoDevel / mt_llm
Pure C wrapper library to use llama.cpp with Linux and Windows as simple as possible.
☆15 · Jul 11, 2026 · Updated last week
fishiatee / yawullm
Yet Another (LLM) Web UI, made with Gemini
☆12 · Dec 25, 2024 · Updated last year
rhulha / Speech2Speech
A web application that converts speech to speech 100% private
☆86 · Jun 3, 2025 · Updated last year
restyler / poor-mans-lovable
A simple CLI app which allows you to generate and deploy simple apps. MVP.
☆21 · Aug 4, 2025 · Updated 11 months ago
TAR-ALEX / llm-html
☆20 · Jul 4, 2025 · Updated last year
clowerweb / kitten-tts-web-demo
Kitten TTS web demo using transformers.js
☆100 · Aug 13, 2025 · Updated 11 months ago
yuzhenmao / IceCache
Implementation for IceCache: Memory-Efficient KV-cache Management for Long-Sequence LLMs (ICLR 2026).
☆19 · Jun 9, 2026 · Updated last month
icryo / remove-refusals-with-transformers
Implements harmful/harmless refusal removal using pure HF Transformers
☆23 · May 8, 2025 · Updated last year
pralab / som-refusal-directions
☆31 · Mar 24, 2026 · Updated 3 months ago
zackshen / gguf
a GGUF file parser
☆21 · Jun 29, 2026 · Updated 3 weeks ago
blindTissue / logit_lens_llama_advanced
☆18 · Jun 22, 2026 · Updated 3 weeks ago
rhulha / EchoMate
A web application that converts speech to speech 100% private using VAD (voice activity detection)
☆18 · Aug 17, 2025 · Updated 11 months ago
ThomasVuNguyen / K
Developing K - a language model to generate OPENSCAD code from prompt
☆19 · Dec 3, 2025 · Updated 7 months ago
Debrup-61 / RaDeR
Official Code Repository for "RaDeR: Reasoning-aware Dense Retrieval Models" accepted at Main Conference EMNLP 2025
☆18 · Jun 23, 2025 · Updated last year
dokasto / Saidia
Offline-first, desktop AI assistant tailored for educators, enabling them to generate questions directly from source materials.
☆24 · Aug 2, 2025 · Updated 11 months ago
shagunmistry / NotebookLM_Alternative
☆21 · Dec 22, 2024 · Updated last year
Unmortan-Ellary / Vascura-FRONT
Bloat Free, Portable and Lightweight LLM Frontend (Single HTML file). With Lorebook, Web Search, Macro Engine etc.
☆22 · Updated this week
DunZhang / Jasper-Token-Compression-Training
The training codes of Jasper-Token-Compression-600M
☆20 · Nov 19, 2025 · Updated 8 months ago
sukanto-m / directory-monitor
☆16 · Oct 28, 2025 · Updated 8 months ago
chigkim / Ollama-MMLU-Pro
☆111 · Aug 21, 2025 · Updated 10 months ago
moaljumaa / halfwayml_open
Open source tool for transcription and subtitling, alternative to happyscribe.
☆36 · Feb 12, 2025 · Updated last year
0xClandestine / mirror-sd
DFlash block-diffusion speculative decoding running on Apple Silicon via MLX, with an ANE execution path that explores heterogeneous acce…
☆62 · Apr 18, 2026 · Updated 3 months ago
severian42 / SIREN
A Field-Theoretic Approach to Unbounded Memory in Large Language Models
☆20 · Apr 15, 2025 · Updated last year
fidecastro / llama-cpp-connector
Super simple python connectors for llama.cpp, including vision models (Gemma 3, Qwen2-VL). Compile llama.cpp and run!
☆31 · Dec 11, 2025 · Updated 7 months ago
airnsk / proxycache
Smart OpenAI‑compatible proxy for llama.cpp: manages slots, saves/restores KV cache to disk, routes requests by prefix similarity, and pr…
☆48 · Nov 14, 2025 · Updated 8 months ago
nytopop / illu
realtime conversational dynamics
☆19 · Mar 19, 2025 · Updated last year
ianling / steg-experiments
Experiments that involve encoding arbitrary data into video files that survive Youtube compression.
☆19 · Sep 2, 2025 · Updated 10 months ago