[ICLR 2025] D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
☆27 · Jul 7, 2025 · Updated 11 months ago
Alternatives and similar repositories for D2O
Users that are interested in D2O are comparing it to the libraries listed below.
- PyTorch implementation of our paper accepted by ICML 2024 -- CaM: Cache Merging for Memory-efficient LLMs Inference ☆50 · Jun 19, 2024 · Updated last year
- ☆46 · Oct 16, 2025 · Updated 7 months ago
- ☆38 · Mar 17, 2025 · Updated last year
- Codebase for the ACL 2023 paper: White-Box Multi-Objective Adversarial Attack on Dialogue Generation. ☆16 · Dec 8, 2023 · Updated 2 years ago
- The Official Implementation of Ada-KV [NeurIPS 2025] ☆134 · Nov 26, 2025 · Updated 6 months ago
- Code repo for "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs". ☆17 · Sep 15, 2024 · Updated last year
- The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction. ☆51 · Oct 18, 2024 · Updated last year
- This is a curated semantic version of the PASCAL-Part dataset for part-based object detection. Objects are aligned with WordNet and Yago … ☆14 · Jan 19, 2022 · Updated 4 years ago
- ☆318 · Jul 10, 2025 · Updated 11 months ago
- ☆47 · Mar 15, 2025 · Updated last year
- Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression." ☆18 · Dec 13, 2024 · Updated last year
- Sharing some problems encountered when applying S2S in practice, and their solutions. ☆28 · Aug 3, 2020 · Updated 5 years ago
- AAAI 2022 paper - Unifying Model Explainability and Robustness for Joint Text Classification and Rationale Extraction ☆17 · Dec 23, 2021 · Updated 4 years ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆34 · Mar 7, 2025 · Updated last year
- Game UI Glitch Detection via Bug Understanding ☆12 · Jul 31, 2021 · Updated 4 years ago
- Code and data for "Impact of Evaluation Methodologies on Code Summarization" in ACL 2022. ☆10 · Sep 6, 2022 · Updated 3 years ago
- Uncertainty-Aware Curriculum Learning for Neural Machine Translation (ACL 2020) ☆11 · Jun 12, 2020 · Updated 5 years ago
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models. ☆10 · May 16, 2024 · Updated 2 years ago
- ☆12 · Jul 4, 2020 · Updated 5 years ago
- ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24) ☆53 · Dec 17, 2024 · Updated last year
- Must-read papers on KV Cache Compression (constantly updating). ☆713 · Apr 15, 2026 · Updated last month
- [EMNLP25] Official code for "POSITION BIAS MITIGATES POSITION BIAS: Mitigate Position Bias Through Inter-Position Knowledge Distillation… ☆38 · Nov 11, 2025 · Updated 7 months ago
- ☆10 · Apr 29, 2023 · Updated 3 years ago
- The codebase for paper "PPT: Token Pruning and Pooling for Efficient Vision Transformer" ☆29 · Nov 17, 2024 · Updated last year
- Template for Makefile based SysY compiler projects. ☆11 · Jun 16, 2022 · Updated 3 years ago
- AloePlayer: a cross-platform local media player. ☆17 · Jan 24, 2026 · Updated 4 months ago
- ☆24 · Jun 7, 2021 · Updated 5 years ago
- ☆34 · Sep 19, 2025 · Updated 8 months ago
- [CVPR 2026] Variation-aware Vision Token Dropping for Faster Large Vision-Language Models ☆30 · May 27, 2026 · Updated 2 weeks ago
- ☆10 · Dec 3, 2024 · Updated last year
- ☆17 · Sep 11, 2025 · Updated 9 months ago
- LLM KV cache compression made easy ☆1,104 · Jun 4, 2026 · Updated last week
- [ICLR2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation ☆254 · Dec 16, 2024 · Updated last year
- Fast and memory-efficient exact attention ☆21 · Apr 10, 2026 · Updated 2 months ago
- KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches. EMNLP Findings 2024 ☆90 · Feb 27, 2025 · Updated last year
- Benchmarking Social Intelligence of Language Agents through Interactive Scenarios ☆13 · Jan 4, 2025 · Updated last year
- Implementation of 'A Convolutional Attention Network for Extreme Summarization of Source Code' ☆15 · Mar 14, 2019 · Updated 7 years ago
- Adversarial Robustness for Code ☆16 · Mar 30, 2021 · Updated 5 years ago
- Source Code for Online Collective Matrix Factorization Hashing. Reference: Di Wang, Quan Wang, Yaqiang An, Xinbo Gao, and Yumin Tian. 202… ☆11 · Oct 20, 2020 · Updated 5 years ago