Welcome to the official repository of SINQ, a novel, fast, high-quality quantization method designed to make any Large Language Model smaller while preserving accuracy.
☆601 · Updated Feb 23, 2026
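SINQ's actual algorithm and API are not shown on this page. As generic background on what LLM weight quantizers do, here is a minimal sketch of plain symmetric round-to-nearest quantization (float weights mapped to low-bit integers plus a scale); this illustrates quantization in general, not SINQ's method, and all function names are hypothetical.

```python
# Minimal sketch of symmetric round-to-nearest (RTN) weight quantization.
# NOT SINQ's algorithm -- just the generic idea: w ~= q * scale, with q
# stored as a small signed integer (e.g. 4 bits).

def quantize_rtn(weights, bits=4):
    """Quantize a list of floats to signed integers in [-qmax, qmax]."""
    qmax = 2 ** (bits - 1) - 1                    # e.g. 7 for 4-bit signed
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from codes and scale."""
    return [v * scale for v in q]

w = [0.12, -0.7, 0.33, 0.05]
q, s = quantize_rtn(w, bits=4)
print(q)                 # low-bit integer codes
print(dequantize(q, s))  # reconstruction; error is bounded by scale / 2
```

Methods like SINQ aim to beat this naive baseline, keeping model accuracy much closer to the full-precision original at the same bit width.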
Alternatives and similar repositories for SINQ
Users interested in SINQ are comparing it to the libraries listed below.
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining. (☆49 · Updated Oct 29, 2025)
- Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing. (☆22 · Updated Sep 1, 2025)
- Two-Step Quantization on AlexNet (☆13 · Updated Jun 29, 2018)
- ☆22 · Updated Aug 9, 2024
- ☆21 · Updated Jan 25, 2025
- EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation (☆27 · Updated Jul 30, 2025)
- ☆11 · Updated Sep 18, 2023
- AI in A Box (☆25 · Updated Feb 23, 2026)
- With DRI3 we can configure in ~/.drirc which GPU a program with a given name should be rendered on. This is a small utility to make this p… (☆10 · Updated Oct 21, 2016)
- ☆11 · Updated Feb 20, 2025
- Yet Another (LLM) Web UI, made with Gemini (☆12 · Updated Dec 25, 2024)
- A chat UI for llama.cpp (☆15 · Updated Dec 2, 2025)
- This Streamlit application allows users to upload images and engage in interactive conversations about them using the Ollama Vision Model… (☆15 · Updated Nov 11, 2024)
- ☆44 · Updated Oct 28, 2025
- Running Microsoft's BitNet inference framework via FastAPI, Uvicorn, and Docker. (☆37 · Updated Jul 2, 2025)
- ☆20 · Updated Oct 6, 2023
- Programming and DevOps assistant tool powered by OpenAI, Anthropic, llama.cpp, and other ChatCompletions-compatible API providers (☆18 · Updated Feb 19, 2026)
- ☆21 · Updated Sep 20, 2025
- Hierarchical roles add-on plugin for Members. (☆15 · Updated Feb 11, 2020)
- 🤖 AI-powered CLI for file reorganization. Runs fully locally; no data leaves your machine. (☆20 · Updated Jul 2, 2025)
- A C++ framework for efficient training and fine-tuning of LLMs (☆28 · Updated Mar 1, 2026)
- Make new tmux windows and panes inherit the currently active conda environment. (☆18 · Updated Dec 22, 2025)
- Lightning Training strategy for HiveMind (☆18 · Updated Jan 20, 2026)
- Generate a llama-quantize command to copy the quantization parameters of any GGUF (☆30 · Updated Jan 23, 2026)
- Produce your own Dynamic 3.0 Quants and achieve optimum accuracy & SOTA quantization performance! Input your VRAM and RAM and the toolcha… (☆79 · Updated Feb 22, 2026)
- ☆19 · Updated Nov 28, 2024
- A fully autonomous agent that accesses the browser and performs tasks. (☆17 · Updated Apr 25, 2025)
- An extension for oobabooga/text-generation-webui that automatically unloads and reloads your model. (☆17 · Updated Apr 22, 2024)
- A Field-Theoretic Approach to Unbounded Memory in Large Language Models (☆20 · Updated Apr 15, 2025)
- A Go wrapper around the rwkv.cpp library (☆20 · Updated Mar 4, 2024)
- A WordPress plugin that adds a button in the editor sidebar to show the raw post data as well as taxonomy and custom-field data (☆20 · Updated Nov 19, 2023)
- LLM FX: an LLM server desktop client, free for everyone! (☆37 · Updated this week)
- Minimal C implementation of speculative decoding based on llama2.c (☆28 · Updated Jul 15, 2024)
- Run the beta versions of Elementor from GitHub. (☆19 · Updated Mar 7, 2018)
- A one-file Ollama CLI client written in bash (☆30 · Updated Sep 7, 2025)
- Code for "RSQ: Learning from Important Tokens Leads to Better Quantized LLMs" (☆21 · Updated Jun 11, 2025)
- Shell wrapper for the serverpilot.io API, https://serverpilot.io/ (☆19 · Updated Apr 17, 2018)
- InferX: Inference as a Service Platform (☆172 · Updated this week)
- [COLM 2025] Official PyTorch implementation of "Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models" (☆72 · Updated Jul 8, 2025)