[ICML 2025] CommVQ: Commutative Vector Quantization for KV Cache Compression
☆27Sep 2, 2025Updated 9 months ago
Alternatives and similar repositories for CommVQ
Users that are interested in CommVQ are comparing it to the libraries listed below.
- Code for "SGSS: Streaming 6-DoF Navigation of Gaussian Splat Scenes"☆17Jun 24, 2025Updated 11 months ago
- This repository contains low-bit quantization papers from 2020 to 2025 from top conferences.☆169Apr 29, 2026Updated last month
- amrnb codec from 3gpp official website http://www.3gpp.org/DynaReport/26204.htm☆10Apr 30, 2014Updated 12 years ago
- PyTorch code for our paper "Progressive Binarization with Semi-Structured Pruning for LLMs"☆13Mar 11, 2026Updated 2 months ago
- TransPimLib is a library for transcendental (and other hard-to-calculate) functions in general-purpose PIM systems. TransPimLib provides …☆15Apr 21, 2023Updated 3 years ago
- Residual vector quantization for KV cache compression in large language models☆12Oct 22, 2024Updated last year
- Influence-maximization-on-hypergraphs☆14Jun 6, 2022Updated 4 years ago
- ☆18Jun 4, 2025Updated last year
- [ISSTA 2025] A Large-scale Empirical Study on Fine-tuning Large Language Models for Unit Testing☆13Feb 9, 2025Updated last year
- This repository presents the source code for the paper "MILLION: Mastering Long-Context LLM Inference Via Outlier-Immunized KV Product Qu…☆25Apr 2, 2025Updated last year
- Dataset of Codex generated tests for the CodaMosa project☆18Jun 2, 2023Updated 3 years ago
- ☆24Jul 14, 2025Updated 10 months ago
- ☆21Apr 3, 2025Updated last year
- Official code of "ADC-GS: Anchor-Driven Deformable and Compressed Gaussian Splatting for Dynamic Scene Reconstruction", IJCAI2025☆31Mar 10, 2026Updated 2 months ago
- [ICCV 2025] QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation☆17Sep 26, 2025Updated 8 months ago
- Artifact of ASPLOS'23 paper entitled: GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference☆19Mar 5, 2023Updated 3 years ago
- [TVLSI 2025] ACiM Inference Simulation Framework in "ASiM: Modeling and Analyzing Inference Accuracy of SRAM-Based Analog CiM Circuits"☆28Sep 9, 2025Updated 8 months ago
- ☆10Sep 26, 2024Updated last year
- A community-driven pypto implementation☆81Jun 2, 2026Updated last week
- [VLDB'23] A Skew-Resistant Index for Processing-in-Memory☆28Jan 5, 2026Updated 5 months ago
- EECS 151/251A FPGA Project Skeleton for Spring 2020☆12May 6, 2020Updated 6 years ago
- This repository covers a wide range of topics including Object-Oriented Programming (OOP), the Standard Template Library (STL), and smart…☆16Mar 14, 2026Updated 2 months ago
- The repository implements the paper "Learning Graph Quantized Tokenizers for Transformers".☆31Apr 2, 2025Updated last year
- [NAACL 2025🔥] MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference☆20Jun 19, 2025Updated 11 months ago
- A comprehensive and efficient long-context model evaluation framework☆31Feb 25, 2026Updated 3 months ago
- [ICLR'25] ARB-LLM: Alternating Refined Binarizations for Large Language Models☆29Aug 5, 2025Updated 10 months ago
- a Computing In Memory emULATOR framework☆16May 19, 2024Updated 2 years ago
- Official code implementation for the ACL 2025 paper: 'Dynamic Scaling of Unit Tests for Code Reward Modeling'☆27May 16, 2025Updated last year
- [ICLR 2023] This repository contains the official Pytorch implementation for the paper "Transformer-based model for symbolic regression v…☆29Jul 2, 2025Updated 11 months ago
- Detecting the various emotions in music audio clips using the k-NN classifier algorithm☆12Jun 7, 2020Updated 6 years ago
- ☆25Apr 3, 2024Updated 2 years ago
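- https://xuruowei.com was created in her memory by her family, her friends, and her partner Gao Ce (高策). Xu Ruowei (徐若薇) passed away on February 28, 2026. We hope to commemorate her life through this timeline: photos, stories, writing, music, and everything she loved. Walk along the path of her life and touch those warm moments once more.☆28Apr 1, 2026Updated 2 months ago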
- Official repository of the 3DV 2025 paper "LapisGS: Layered Progressive 3D Gaussian Splatting for Adaptive Streaming"☆51Updated this week
- Evaluate GCC congestion control on ns3☆37Oct 11, 2024Updated last year
- Efficient non-uniform quantization with GPTQ for GGUF☆62Sep 17, 2025Updated 8 months ago
- PYNQ bindings for C and C++ to avoid requiring Python or Vitis to execute hardware acceleration.☆31Apr 9, 2026Updated 2 months ago
- Last.fm Dataset - 1K users☆18Aug 24, 2020Updated 5 years ago
- hexo theme☆19Apr 7, 2026Updated 2 months ago
- Simulator for LLM inference on an abstract 3D AIMC-based accelerator☆32Sep 18, 2025Updated 8 months ago