xhedit / quantkit

CLI tool to quantize GGUF, GPTQ, AWQ, HQQ and EXL2 models

☆73

Alternatives and similar repositories for quantkit

Users interested in quantkit are comparing it to the libraries listed below.

tdrussell / qlora-pipe
A pipeline parallel training script for LLMs.
☆153 Updated 2 months ago
LostRuins / datasetexplorer
Easily view and modify JSON datasets for large language models
☆78 Updated 2 months ago
jukofyork / transplant-vocab
Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining.
☆31 Updated 3 months ago
nath1295 / LLMFlex
A python package for developing AI applications with local LLMs.
☆150 Updated 6 months ago
thomasgauthier / LoRD
Low-Rank adapter extraction for fine-tuned transformers models
☆173 Updated last year
mounta11n / plusplus-camall
After my server ui improvements were successfully merged, consider this repo a playground for experimenting, tinkering and hacking around…
☆54 Updated 10 months ago
av / klmbr
klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs
☆78 Updated 9 months ago
the-crypt-keeper / tcurtsni
Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you?
☆22 Updated last year
agokrani / distillKitPlus
Easy to use, High Performant Knowledge Distillation for LLMs
☆88 Updated 2 months ago
Lizonghang / TPI-LLM
TPI-LLM: Serving 70b-scale LLMs Efficiently on Low-resource Edge Devices
☆185 Updated last month
Aesthisia / LLMinator
Gradio based tool to run opensource LLM models directly from Huggingface
☆93 Updated last year
EduardTalianu / EntropixLab
entropix style sampling + GUI
☆26 Updated 8 months ago
Gryphe / MergeMonster
An unsupervised model merging algorithm for Transformers-based language models.
☆105 Updated last year
perk11 / large-model-proxy
Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe…
☆67 Updated 2 weeks ago
chigkim / Ollama-MMLU-Pro
☆95 Updated 6 months ago
QuixiAI / kraken
☆66 Updated last year
fairydreaming / llama.cpp
LLM inference in C/C++
☆21 Updated 3 months ago
leafspark / AutoGGUF
automatically quant GGUF models
☆187 Updated this week
teknium1 / ShareGPT-Builder
☆115 Updated 6 months ago
jabberjabberjabber / Chunkify
Create text chunks which end at natural stopping points without using a tokenizer
☆25 Updated 4 months ago
bdambrosio / AllTheWorldAPlay
All the world is a play, we are but actors in it.
☆50 Updated this week
mzbac / mlx_sharding
Distributed Inference for mlx LLm
☆93 Updated 11 months ago
Uminosachi / open-llm-webui
This repository contains a web application designed to execute relatively compact, locally-operated Large Language Models (LLMs).
☆43 Updated 3 months ago
nyunAI / PruneGPT
☆52 Updated last year
fairydreaming / farel-bench
Testing LLM reasoning abilities with family relationship quizzes.
☆62 Updated 5 months ago
latent-variable / r1_reasoning_effort
Forces DeepSeek R1 models to engage in extended reasoning by intercepting early termination tokens.
☆19 Updated 5 months ago
severian42 / Computational-Model-for-Symbolic-Representations
Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …
☆49 Updated 5 months ago
rombodawg / Easy_training
☆49 Updated 4 months ago
cognitivecomputations / OpenChatML
☆157 Updated last year
mzbac / mlx-llm-server
For inferring and serving local LLMs using the MLX framework
☆105 Updated last year