scrya-com/rotorquant


KV cache compression via block-diagonal rotation. Beats TurboQuant: better PPL (6.91 vs 7.07), 28% faster decode, 5.3x faster prefill, 44x fewer params. Drop-in llama.cpp integration.

☆1,027

Alternatives and similar repositories for rotorquant

Users who are interested in rotorquant are comparing it to the repositories listed below.

liangbingzhao / PhysicEdit
[ICML2026] From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors
☆92 • Apr 30, 2026 • Updated 2 months ago
Visual-AI / Inpaint4Drag
[ICCV 2025] Inpaint4Drag: Repurposing Inpainting Models for Drag-Based Image Editing via Bidirectional Warping
☆94 • Nov 30, 2025 • Updated 7 months ago
johndpope / llama-cpp-turboquant
LLM inference in C/C++
☆63 • May 7, 2026 • Updated 2 months ago
TheTom / llama-cpp-turboquant
LLM inference in C/C++
☆2,097 • Updated this week
jwkirchenbauer / mtp-lm
Source code to accompany research paper on training multi token prediction language models using self-distillation.
☆39 • Feb 21, 2026 • Updated 4 months ago
ratson / facebook-py
☆12 • Sep 16, 2017 • Updated 8 years ago
HadarDavidson / colored-noise-sampling
Official Implementation of "Colored Noise Diffusion Sampling"
☆38 • Jun 1, 2026 • Updated last month
sigmod26gem / sigmod26gem
☆17 • Mar 13, 2026 • Updated 4 months ago
taco-group / Pulse-of-Motion
The Pulse of Motion: Measuring Physical Frame Rate from Visual Dynamics
☆70 • Mar 26, 2026 • Updated 3 months ago
franciszzj / Saber
[CVPR 2026] Scaling Zero-Shot Reference-to-Video Generation
☆76 • Apr 28, 2026 • Updated 2 months ago
iSEE-Laboratory / ProEdit
Official repository of paper "ProEdit: Inversion-based Editing From Prompts Done Right"
☆116 • Feb 5, 2026 • Updated 5 months ago
NolanoOrg / SpectraSuite
☆54 • Jul 18, 2024 • Updated last year
Stability-AI / arbor
Control 3D Generation with Explicit Geometry
☆97 • Jun 22, 2026 • Updated 3 weeks ago
localai-org / apex-quant
Adaptive Precision for EXpert Models: MoE-aware mixed-precision quantization
☆387 • May 29, 2026 • Updated last month
EasonXiao-888 / SpatialEdit
[Official Repo] SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing
☆212 • Apr 13, 2026 • Updated 3 months ago
bcml-labs / rosa-plus
ROSA+: RWKV's ROSA implementation with fallback statistical predictor
☆36 • Oct 13, 2025 • Updated 9 months ago
GordonChen19 / Prompt-Relay
An inference-time, plug-and-play method for temporal control in multi-event generation
☆184 • Apr 26, 2026 • Updated 2 months ago
Sphere-AI-Lab / diagdistill
Implementation of "Streaming Autoregressive Video Generation via Diagonal Distillation" in ICLR 2026
☆129 • Mar 18, 2026 • Updated 3 months ago
Labman42 / JetEngine
A lightweight Inference Engine built for block diffusion models
☆47 • Apr 12, 2026 • Updated 3 months ago
StoreBlank / online-spatial-intelligence
☆33 • Mar 14, 2026 • Updated 4 months ago
TheTom / turboquant_plus
☆6,992 • Jun 26, 2026 • Updated 2 weeks ago
spiritbuun / buun-llama-cpp
LLAMA Turboquant implementation with CUDA support
☆687 • Updated this week
pipecat-ai / pipecat-client-ios
☆25 • Jul 2, 2026 • Updated last week
VAST-AI-Research / AniGen
[SIGGRAPH 2026] AniGen: Unified S^3 Fields for Animatable 3D Asset Generation
☆443 • Updated this week
snu-mllab / GuidedQuant
Official PyTorch implementation of "GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance" (ICML 2025)
☆53 • Apr 13, 2026 • Updated 3 months ago
moonmath-ai / LiteAttention
Transforming Video Diffusion with Temporal Sparse Attention
☆54 • Apr 8, 2026 • Updated 3 months ago
guyyariv / DyPE
[ICML 2026] Official implementation for "DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion".
☆356 • May 18, 2026 • Updated last month
thc1006 / qwen3.6-speculative-decoding-rtx3090
First public benchmark of llama.cpp speculative decoding on Qwen3.6-35B-A3B with a single RTX 3090 (post PR #19493 merge, 2026-04-19). 19…
☆31 • May 16, 2026 • Updated last month
Shredded-Pork / Flash-GRPO
[ICML 2026] Flash-GRPO: Efficient Alignment for Video Diffusion via One-Step Policy Optimization
☆55 • Jun 11, 2026 • Updated last month
LLM360 / k2-data-prep
☆21 • Jun 4, 2024 • Updated 2 years ago
Xinxi-Zhang / Re-MeanFlow
☆48 • Mar 29, 2026 • Updated 3 months ago
IST-DASLab / Quartet-II
Quartet II Official Code
☆75 • May 1, 2026 • Updated 2 months ago
icryo / remove-refusals-with-transformers
Implements harmful/harmless refusal removal using pure HF Transformers
☆21 • May 8, 2025 • Updated last year
Guoxu1233 / DreamID-Omni
[ICML 2026] DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation
☆272 • May 22, 2026 • Updated last month
fredconex / Arandu
Llama.cpp launcher with integrated huggingface
☆59 • Jun 25, 2026 • Updated 2 weeks ago
jxiw / MambaByte
[CoLM 24] Official Repository of MambaByte: Token-free Selective State Space Model
☆27 • Oct 12, 2024 • Updated last year
yehonathanlitman / Lift4D
Lift4D: Harmonizing Single-View 3D Estimation for 4D Reconstruction In-the-Wild
☆224 • Jun 23, 2026 • Updated 3 weeks ago
pryzmatpl / prismalama
Get up and running with Kimi-K2.5, GLM-4.7, DeepSeek, gpt-oss, Qwen, Gemma and other models.
☆37 • May 17, 2026 • Updated last month
emjay73 / InfCam
☆90 • May 13, 2026 • Updated 2 months ago