KV cache compression via block-diagonal rotation. Beats TurboQuant: better PPL (6.91 vs 7.07), 28% faster decode, 5.3x faster prefill, 44x fewer params. Drop-in llama.cpp integration.
☆653 · Apr 3, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for rotorquant
Users who are interested in rotorquant are comparing it to the repositories listed below.
- From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors ☆84 · Mar 7, 2026 · Updated last month
- [ICCV 2025] Inpaint4Drag: Repurposing Inpainting Models for Drag-Based Image Editing via Bidirectional Warping ☆93 · Nov 30, 2025 · Updated 4 months ago
- Animate Any Character in Any World ☆97 · Mar 10, 2026 · Updated last month
- Source code to accompany research paper on training multi token prediction language models using self-distillation. ☆33 · Feb 21, 2026 · Updated last month
- [CVPR 2026] Scaling Zero-Shot Reference-to-Video Generation ☆69 · Dec 11, 2025 · Updated 4 months ago
- Llama.cpp launcher with integrated huggingface ☆47 · Mar 23, 2026 · Updated 3 weeks ago
- Implementation of <Streaming Autoregressive Video Generation via Diagonal Distillation> in ICLR 2026 ☆120 · Mar 18, 2026 · Updated last month
- Neo4j Cypher Documentation ☆19 · Updated this week
- Official repository of paper "ProEdit: Inversion-based Editing From Prompts Done Right" ☆116 · Feb 5, 2026 · Updated 2 months ago
- [CVPR 2026] 👋 Dataset and Benchmark code for EgoEdit ☆138 · Apr 5, 2026 · Updated 2 weeks ago
- Get aid from local LLMs right in your PowerShell ☆16 · May 2, 2025 · Updated 11 months ago
- A collection of outcomes and discoveries from our legal AI research projects ☆25 · Apr 10, 2026 · Updated last week
- [Arxiv 2026] ActionPlan: Future-Aware Streaming Motion Synthesis via Frame-Level Action Planning ☆74 · Mar 26, 2026 · Updated 3 weeks ago
- Modular task agnostic training pipeline using LFM2 from Liquid AI with unsloth. ☆16 · Sep 13, 2025 · Updated 7 months ago
- ROSA+: RWKV's ROSA implementation with fallback statistical predictor ☆34 · Oct 13, 2025 · Updated 6 months ago
- Official PyTorch implementation of "GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance" (ICML 2025) ☆51 · Updated this week
- DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation ☆247 · Mar 25, 2026 · Updated 3 weeks ago
- Ultra-Sparse Adaptation of 1-Bit LLMs via XOR Patches ☆64 · Apr 10, 2026 · Updated last week
- ☆33 · Nov 18, 2025 · Updated 5 months ago
- ☆21 · Jun 4, 2024 · Updated last year
- ☆88 · Feb 4, 2026 · Updated 2 months ago
- [CoLM 24] Official Repository of MambaByte: Token-free Selective State Space Model ☆25 · Oct 12, 2024 · Updated last year
- Code for setting up an MCP-based PR review agent using your favorite LLM ☆16 · Feb 5, 2025 · Updated last year
- [CVPR'26] VecGlypher: Unified Vector Glyph Generation with Language Models ☆118 · Feb 26, 2026 · Updated last month
- ☆60 · Jul 4, 2024 · Updated last year
- D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI [ICLR 2026] ☆77 · Mar 3, 2026 · Updated last month
- [ICCV 2025] Enhancing spatial understanding in text-to-Image diffusion models ☆93 · Sep 11, 2025 · Updated 7 months ago
- ☆49 · May 20, 2025 · Updated 10 months ago
- [3DV 2026 Oral] VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space ☆222 · Nov 25, 2025 · Updated 4 months ago
- Easily explore, manage, and interact with your local Ollama models. ☆12 · Jul 18, 2024 · Updated last year
- ☆34 · Mar 19, 2026 · Updated last month
- REAP: Router-weighted Expert Activation Pruning for SMoE compression ☆334 · Apr 8, 2026 · Updated last week
- ☆47 · Mar 29, 2026 · Updated 2 weeks ago
- [AAAI 2026] UltraGen ☆78 · Feb 1, 2026 · Updated 2 months ago
- Tensor library for machine learning ☆17 · Jul 13, 2023 · Updated 2 years ago
- Official implementation for "Story2Board: A Training‑Free Approach for Expressive Storyboard Generation" ☆245 · Aug 22, 2025 · Updated 7 months ago
- ToonOut, a fork of BiRefNet focused on background removal for anime images. We open-source our dataset & our weights. See our paper at: h… ☆91 · Sep 10, 2025 · Updated 7 months ago
- NeuroBLAST v3 architecture code ☆37 · Jan 6, 2026 · Updated 3 months ago
- A Model Context Protocol server for Python code analysis with Claude. Again, works with warning now. I'm missing something here. ☆11 · Nov 29, 2025 · Updated 4 months ago