[NAACL 2025🔥] MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference
★20 · Jun 19, 2025 · Updated 10 months ago
Alternatives and similar repositories for MEDA
Users that are interested in MEDA are comparing it to the libraries listed below.
- [CVPR 2026] Variation-aware Vision Token Dropping for Faster Large Vision-Language Models ★31 · Mar 18, 2026 · Updated last month
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In… ★104 · Nov 9, 2024 · Updated last year
- pytorch-TripletSemiHardLoss ★10 · Jan 12, 2022 · Updated 4 years ago
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference ★10 · Dec 15, 2024 · Updated last year
- Fast, memory-efficient attention column reduction (e.g., sum, mean, max) ★45 · Feb 10, 2026 · Updated 2 months ago
- [AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models ★42 · Jan 27, 2026 · Updated 3 months ago
- Source code of paper 'LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval' (WWW 2023) ★22 · Aug 28, 2023 · Updated 2 years ago
- [AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio… ★45 · Apr 18, 2025 · Updated last year
- [EMNLP 2025 Main] SpecVLM: Enhancing Speculative Decoding of Video LLMs via Verifier-Guided Token Pruning ★40 · Apr 16, 2026 · Updated 2 weeks ago
- Source code of paper 'Open Hierarchical Relation Extraction' (NAACL 2021) ★22 · Mar 4, 2022 · Updated 4 years ago
- Must-read papers on Fine-grained Entity Typing ★19 · Jul 7, 2022 · Updated 3 years ago
- Demo for advanced Java final project in 18-19 1 of Canghong Jin ★25 · Nov 18, 2018 · Updated 7 years ago
- [COLM 2024] SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models ★24 · Oct 5, 2024 · Updated last year
- A comprehensive and efficient long-context model evaluation framework ★31 · Feb 25, 2026 · Updated 2 months ago
- [NeurIPS 2024] The official implementation of ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification ★31 · Mar 30, 2025 · Updated last year
- ZORB: A Derivative-Free Backpropagation Algorithm for Neural Networks ★22 · Nov 17, 2020 · Updated 5 years ago
- ★36 · Jun 3, 2025 · Updated 10 months ago
- Source code of paper "KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing" ★31 · Oct 24, 2024 · Updated last year
- This is the open-source code for TokenCarve. ★26 · Jan 23, 2026 · Updated 3 months ago
- [ACL'23 Findings] "Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors" ★41 · Dec 22, 2023 · Updated 2 years ago
- CVPR2024 highlight. ★13 · Oct 10, 2024 · Updated last year
- AutoHallusion Codebase (EMNLP 2024) ★22 · Dec 6, 2024 · Updated last year
- Information-Driven Design of Imaging Systems ★22 · Mar 26, 2026 · Updated last month
- [NeurIPS'25 Oral] Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3) ★217 · Feb 11, 2026 · Updated 2 months ago
- [ACL 2025] PruneVid: Visual Token Pruning for Efficient Video Large Language Models ★71 · May 15, 2025 · Updated 11 months ago
- Official PyTorch implementation of Agglomerative Token Clustering presented at ECCV 2024 ★20 · Sep 19, 2024 · Updated last year
- MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following ★16 · Oct 31, 2024 · Updated last year
- ★47 · Nov 25, 2024 · Updated last year
- This is the source code for the paper "Training CNNs on speckled optical dataset for edge detection in SAR images". ★12 · Mar 12, 2022 · Updated 4 years ago
- Split-step non-paraxial method ★16 · Nov 9, 2025 · Updated 5 months ago
- MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer ★51 · Sep 6, 2024 · Updated last year
- X Correlation Forward Scattering Media ★10 · May 11, 2021 · Updated 4 years ago
- ViLoMem: Agentic Learner with Grow-and-Refine Multimodal Semantic Memory ★61 · Apr 21, 2026 · Updated last week
- ★11 · Mar 18, 2022 · Updated 4 years ago
- code for L2 regularization of arbitrary Tikhonov matrices ★13 · Mar 16, 2018 · Updated 8 years ago
- ★10 · Dec 3, 2024 · Updated last year
- Official repository for "Solving Video Inverse Problems Using Image Diffusion Models" ★11 · Mar 7, 2026 · Updated last month
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ★34 · Mar 7, 2025 · Updated last year
- ★16 · Sep 11, 2025 · Updated 7 months ago