[NAACL 2025🔥] MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference
☆17 · Jun 19, 2025 · Updated 8 months ago
Alternatives and similar repositories for MEDA
Users interested in MEDA are comparing it to the libraries listed below.
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference ☆10 · Dec 15, 2024 · Updated last year
- pytorch-TripletSemiHardLoss ☆10 · Jan 12, 2022 · Updated 4 years ago
- Fast, memory-efficient attention column reduction (e.g., sum, mean, max) ☆37 · Feb 10, 2026 · Updated 2 weeks ago
- [AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models ☆38 · Jan 27, 2026 · Updated last month
- A comprehensive and efficient long-context model evaluation framework ☆31 · Updated this week
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…" ☆104 · Nov 9, 2024 · Updated last year
- [EMNLP 2025 Main] SpecVLM: Enhancing Speculative Decoding of Video LLMs via Verifier-Guided Token Pruning ☆34 · Jan 11, 2026 · Updated last month