[NAACL 2025🔥] MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference
★20 · Jun 19, 2025 · Updated 11 months ago
Alternatives and similar repositories for MEDA
Users that are interested in MEDA are comparing it to the libraries listed below.
- [CVPR 2026] Variation-aware Vision Token Dropping for Faster Large Vision-Language Models ★30 · Mar 18, 2026 · Updated 2 months ago
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…" ★103 · Nov 9, 2024 · Updated last year
- pytorch-TripletSemiHardLoss ★10 · Jan 12, 2022 · Updated 4 years ago
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference ★10 · Dec 15, 2024 · Updated last year
- Fast, memory-efficient attention column reduction (e.g., sum, mean, max) ★46 · Feb 10, 2026 · Updated 3 months ago
- Source code of paper 'LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval' (WWW 2023) ★22 · Aug 28, 2023 · Updated 2 years ago
- [AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio… ★44 · Apr 18, 2025 · Updated last year
- [EMNLP 2025 Main] SpecVLM: Enhancing Speculative Decoding of Video LLMs via Verifier-Guided Token Pruning ★44 · Apr 16, 2026 · Updated last month
- Source code of paper 'Open Hierarchical Relation Extraction' (NAACL 2021) ★22 · Mar 4, 2022 · Updated 4 years ago
- [COLM 2024] SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models ★24 · Oct 5, 2024 · Updated last year
- A comprehensive and efficient long-context model evaluation framework ★31 · Feb 25, 2026 · Updated 2 months ago
- Demo for advanced Java final project in 18-19 1 of Canghong Jin ★25 · Nov 18, 2018 · Updated 7 years ago
- [NeurIPS 2024] The official implementation of ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification ★31 · Mar 30, 2025 · Updated last year
- ★35 · Jun 3, 2025 · Updated 11 months ago
- This is the open-source code for TokenCarve. ★26 · Jan 23, 2026 · Updated 3 months ago
- [ACL'23 Findings] "Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors" ★40 · Dec 22, 2023 · Updated 2 years ago
- CVPR2024 highlight. ★13 · Oct 10, 2024 · Updated last year
- AutoHallusion Codebase (EMNLP 2024) ★22 · Dec 6, 2024 · Updated last year
- [NeurIPS'25 Oral] Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3) ★219 · Feb 11, 2026 · Updated 3 months ago
- [ACL 2025] PruneVid: Visual Token Pruning for Efficient Video Large Language Models ★71 · May 15, 2025 · Updated last year
- Official PyTorch implementation of Agglomerative Token Clustering presented at ECCV 2024 ★20 · Sep 19, 2024 · Updated last year
- ★47 · Nov 25, 2024 · Updated last year
- MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer ★50 · Sep 6, 2024 · Updated last year
- ViLoMem: Agentic Learner with Grow-and-Refine Multimodal Semantic Memory ★64 · Apr 21, 2026 · Updated last month
- ★10 · Dec 3, 2024 · Updated last year
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ★34 · Mar 7, 2025 · Updated last year
- ★16 · Sep 11, 2025 · Updated 8 months ago
- [A chat software demo] a chat software powered by libevent/mysql and qt ★10 · Sep 10, 2021 · Updated 4 years ago
- [NeurIPS 2025] FastVID: Dynamic Density Pruning for Fast Video Large Language Models ★36 · Nov 10, 2025 · Updated 6 months ago
- ★35 · Oct 4, 2025 · Updated 7 months ago
- serverless vscode webide ★17 · Apr 14, 2023 · Updated 3 years ago
- EUV Layer Hotspot Detection Benchmark Suit ★20 · Mar 8, 2021 · Updated 5 years ago
- ★13 · Jul 3, 2024 · Updated last year
- [ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning" ★59 · Oct 9, 2025 · Updated 7 months ago
- Project Page for GaussianFormer ★24 · May 30, 2024 · Updated last year
- [ACL 2026 Main] Revisit What You See: Revealing Visual Semantics in Vision Tokens to Guide LVLM Decoding ★25 · Nov 21, 2025 · Updated 6 months ago
- [AAAI 2026 Oral] HiMo-CLIP: Modeling Semantic Hierarchy and Monotonicity in Vision-Language Alignment ★30 · Dec 17, 2025 · Updated 5 months ago
- ★13 · May 15, 2025 · Updated last year
- a training-free approach to accelerate ViTs and VLMs by pruning redundant tokens based on similarity ★44 · May 24, 2025 · Updated 11 months ago