huawei-csl / SINQ
Welcome to the official repository of SINQ! A novel, fast and high-quality quantization method designed to make any Large Language Model smaller while preserving accuracy.
☆595 · Updated this week
Alternatives and similar repositories for SINQ
Users interested in SINQ are comparing it to the libraries listed below.
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining. ☆49 · Oct 29, 2025 · Updated 3 months ago
- Two-Step Quantization on AlexNet ☆13 · Jun 29, 2018 · Updated 7 years ago
- ☆22 · Aug 9, 2024 · Updated last year
- ☆20 · Jan 25, 2025 · Updated last year
- AI in A Box ☆25 · Jan 20, 2026 · Updated 3 weeks ago
- Single-file, pure CUDA C implementation for running inference on Qwen3 0.6B GGUF. No Dependencies. ☆22 · Nov 26, 2025 · Updated 2 months ago
- EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation ☆27 · Jul 30, 2025 · Updated 6 months ago
- ☆11 · Sep 18, 2023 · Updated 2 years ago
- A chat UI for Llama.cpp ☆15 · Dec 2, 2025 · Updated 2 months ago
- ☆11 · Feb 20, 2025 · Updated 11 months ago
- Yet Another (LLM) Web UI, made with Gemini ☆12 · Dec 25, 2024 · Updated last year
- Home of ALP/GraphBLAS and ALP/Pregel, featuring shared- and distributed-memory auto-parallelisation of linear algebraic and vertex-centri… ☆33 · Updated this week
- See vLLM official support: https://github.com/vllm-project/vllm-ascend ☆11 · Feb 5, 2025 · Updated last year
- Running Microsoft's BitNet inference framework via FastAPI, Uvicorn and Docker. ☆36 · Jul 2, 2025 · Updated 7 months ago
- This Streamlit application allows users to upload images and engage in interactive conversations about them using the Ollama Vision Model… ☆15 · Nov 11, 2024 · Updated last year
- GitHub repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition ☆17 · Apr 16, 2025 · Updated 9 months ago
- 🤖 AI-powered CLI for file reorganization. Runs fully locally; no data leaves your machine. ☆19 · Jul 2, 2025 · Updated 7 months ago
- Watch for file changes and auto restart an application using fork checkpoints to continue the process (for quick live development) ☆13 · Dec 30, 2021 · Updated 4 years ago
- Programming and DevOps assistant tool powered by OpenAI, Anthropic and llama.cpp ☆16 · Nov 1, 2023 · Updated 2 years ago
- ☆20 · Oct 6, 2023 · Updated 2 years ago
- This repository contains the training code of ParetoQ introduced in our work "ParetoQ Scaling Laws in Extremely Low-bit LLM Quantization" ☆118 · Oct 15, 2025 · Updated 3 months ago
- Make new tmux windows and panes inherit the currently active conda environment. ☆17 · Dec 22, 2025 · Updated last month
- Hierarchical roles add-on plugin for Members. ☆15 · Feb 11, 2020 · Updated 6 years ago
- Lightning Training strategy for HiveMind ☆18 · Jan 20, 2026 · Updated 3 weeks ago
- Script-based task scheduler with scalable architecture, and integrated dependency management. ☆16 · Nov 5, 2023 · Updated 2 years ago
- A complete end-to-end pipeline for training specialized Small Language Models (SLMs) on custom business data. OTTO enables organizations … ☆34 · Oct 1, 2025 · Updated 4 months ago
- World's most accurate password guessing AI tool. A PyTorch implementation of PassLLM (USENIX 2025) that leverages PII and LoRA fine-tunin… ☆38 · Feb 4, 2026 · Updated last week
- MLIR tools and dialect for GraphBLAS ☆18 · Mar 30, 2022 · Updated 3 years ago
- A C++ framework for efficient training & fine-tuning of LLMs ☆27 · Updated this week
- Efficient Decision tree Ensembles library for IoT edge nodes ☆16 · Jan 29, 2025 · Updated last year
- An extension for oobabooga/text-generation-webui that automatically unloads and reloads your model. ☆17 · Apr 22, 2024 · Updated last year
- A fully autonomous agent that accesses the browser and performs tasks. ☆17 · Apr 25, 2025 · Updated 9 months ago
- An agentic project design and coding orchestrator for both on-prem and off-prem development ☆66 · Updated this week
- Cross-Platform High-Level LLM Library ☆43 · Feb 5, 2026 · Updated last week
- A WordPress plugin that adds a button in the editor sidebar to show the raw post data as well as taxonomy and custom field data ☆20 · Nov 19, 2023 · Updated 2 years ago
- Turn any Kiwix ZIM archive (offline Wikipedia, Stack Exchange, DevDocs, etc.) into an instant knowledge source for LLMs with a tiny CLI +… ☆73 · Jun 4, 2025 · Updated 8 months ago
- Shell wrapper for the serverpilot.io API https://serverpilot.io/ ☆19 · Apr 17, 2018 · Updated 7 years ago
- Minimal C implementation of speculative decoding based on llama2.c ☆25 · Jul 15, 2024 · Updated last year
- Agentic RAG to help you build a startup🚀 ☆55 · Apr 5, 2025 · Updated 10 months ago