PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation [NeurIPS 2025]
☆18 · Oct 11, 2025 · Updated 5 months ago
Alternatives and similar repositories for PrefixKV
Users interested in PrefixKV are comparing it to the repositories listed below.
- Official repo for the paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs" ☆22 · Apr 23, 2025 · Updated 11 months ago
- Official implementation of the ICLR 2023 paper "Consolidator: Mergeable Adapter with Grouped Connections for Visual Adaptation" ☆16 · Jan 23, 2024 · Updated 2 years ago
- PyTorch code for [CVPR 2023] "NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction" ☆11 · Mar 14, 2023 · Updated 3 years ago
- ThinK: Thinner Key Cache by Query-Driven Pruning ☆27 · Feb 11, 2025 · Updated last year
- Stanford Cars dataset organized by class folders ☆19 · Nov 7, 2024 · Updated last year
- ☆11 · Sep 9, 2024 · Updated last year
- [GCPR 2023] UGainS: Uncertainty Guided Anomaly Instance Segmentation ☆16 · Jul 31, 2024 · Updated last year
- ☆15 · Sep 14, 2025 · Updated 6 months ago
- [ECCV 2024] Efficient Inference of Vision Instruction-Following Models with Elastic Cache ☆43 · Jul 26, 2024 · Updated last year
- Code for "Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning" (EMNLP 2022) and "Empowering Parameter-Efficient Transfer Learning…" ☆11 · Feb 6, 2023 · Updated 3 years ago
- Code for the paper "No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations" ☆12 · Oct 31, 2024 · Updated last year
- (ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation ☆34 · May 28, 2025 · Updated 9 months ago
- [AAAI 2026] Official PyTorch implementation of the paper "Filter, Correlate, Compress: Training-Free Token Reduction for MLLM Acc…" ☆59 · Nov 13, 2025 · Updated 4 months ago
- Code for "A New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models" ☆67 · Feb 18, 2025 · Updated last year
- Official implementation of FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acceleration ☆30 · Nov 22, 2025 · Updated 4 months ago
- Official implementation of our ICCV 2023 publication, C-VisDiT ☆10 · Oct 23, 2024 · Updated last year
- Cross Visual Prompt Tuning [ICCV 2025] ☆13 · Aug 3, 2025 · Updated 7 months ago
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference ☆10 · Dec 15, 2024 · Updated last year
- BESA: a differentiable weight pruning technique for large language models ☆17 · Mar 4, 2024 · Updated 2 years ago
- ☆17 · May 2, 2024 · Updated last year
- i-MAE PyTorch repo ☆20 · Apr 6, 2024 · Updated last year
- Code accompanying the NeurIPS 2019 paper "AutoAssist: A Framework to Accelerate Training of Deep Neural Networks" ☆14 · Oct 3, 2022 · Updated 3 years ago
- (WACV 2024) Kaizen: Practical self-supervised continual learning with continual fine-tuning ☆16 · Oct 29, 2024 · Updated last year
- Source code of our paper "Focus on the Target's Vocabulary: Masked Label Smoothing for Machine Translation" (ACL 2022) ☆18 · May 19, 2022 · Updated 3 years ago
- [NeurIPS 2024] Frustratingly Easy Test-Time Adaptation of VLMs ☆61 · Mar 24, 2025 · Updated last year
- [ICML 2025] Efficiently Serving Large Multimodal Models Using EPD Disaggregation ☆23 · May 29, 2025 · Updated 9 months ago
- Official PyTorch implementation of ASAG (ICCV 2023) ☆18 · Sep 9, 2023 · Updated 2 years ago
- Official GitHub repo for the open online course "Dive into LLMs" ☆10 · Mar 15, 2024 · Updated 2 years ago
- ☆10 · Jul 5, 2023 · Updated 2 years ago
- Channel pruning for accelerating very deep neural networks ☆13 · Mar 8, 2021 · Updated 5 years ago
- [EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks ☆10 · Nov 27, 2024 · Updated last year
- LLM quantization toolkit ☆19 · Jul 4, 2025 · Updated 8 months ago
- ☆13 · Jun 28, 2021 · Updated 4 years ago
- Inverted triple pendulum ☆16 · May 13, 2019 · Updated 6 years ago
- ☆10 · Dec 9, 2021 · Updated 4 years ago
- ☆13 · May 12, 2025 · Updated 10 months ago
- [ICML 2024] "Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection" ☆15 · Feb 15, 2025 · Updated last year
- [ICLR 2025] "Noisy Test-Time Adaptation in Vision-Language Models" ☆12 · Feb 22, 2025 · Updated last year
- [ICML 2025] KVTuner: Sensitivity-Aware Layer-wise Mixed Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference ☆26 · Jan 27, 2026 · Updated last month