[ICLR 2025🔥] D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
⭐27 · Jul 7, 2025 · Updated 8 months ago
Alternatives and similar repositories for D2O
Users interested in D2O are comparing it to the repositories listed below.
- ⭐39 · Oct 16, 2025 · Updated 5 months ago
- ⭐47 · Nov 25, 2024 · Updated last year
- ⭐36 · Mar 17, 2025 · Updated last year
- Official PyTorch implementation of WPS from our paper: WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation Models · ⭐14 · Jun 12, 2025 · Updated 9 months ago
- Codebase for the ACL 2023 paper: White-Box Multi-Objective Adversarial Attack on Dialogue Generation. · ⭐16 · Dec 8, 2023 · Updated 2 years ago
- [ICLR 2025] TidalDecode: A Fast and Accurate LLM Decoding with Position Persistent Sparse Attention · ⭐53 · Aug 6, 2025 · Updated 7 months ago
- The Official Implementation of Ada-KV [NeurIPS 2025] · ⭐128 · Nov 26, 2025 · Updated 3 months ago
- Code repo for "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs". · ⭐16 · Sep 15, 2024 · Updated last year
- This is the repo for constructing a comprehensive and rigorous evaluation framework for LLM calibration. · ⭐13 · Apr 9, 2024 · Updated last year
- This is a curated semantic version of the PASCAL-Part dataset for part-based object detection. Objects are aligned with WordNet and Yago … · ⭐14 · Jan 19, 2022 · Updated 4 years ago
- ⭐306 · Jul 10, 2025 · Updated 8 months ago
- ⭐46 · Mar 15, 2025 · Updated last year
- Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression." · ⭐18 · Dec 13, 2024 · Updated last year
- AAAI 2022 paper - Unifying Model Explainability and Robustness for Joint Text Classification and Rationale Extraction · ⭐17 · Dec 23, 2021 · Updated 4 years ago
- Repository of IPBench · ⭐19 · Jan 4, 2026 · Updated 2 months ago
- Awesome-LLM-KV-Cache: A curated list of Awesome LLM KV Cache Papers with Codes. · ⭐418 · Mar 3, 2025 · Updated last year
- Code for the paper "Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers" · ⭐18 · Dec 15, 2020 · Updated 5 years ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention · ⭐15 · Jul 16, 2025 · Updated 8 months ago
- Code and data for "Impact of Evaluation Methodologies on Code Summarization" in ACL 2022. · ⭐10 · Sep 6, 2022 · Updated 3 years ago
- ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24) · ⭐52 · Dec 17, 2024 · Updated last year
- Uncertainty-Aware Curriculum Learning for Neural Machine Translation (ACL 2020) · ⭐11 · Jun 12, 2020 · Updated 5 years ago
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models. · ⭐10 · May 16, 2024 · Updated last year
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗). · ⭐674 · Feb 24, 2026 · Updated 3 weeks ago
- ⭐10 · Apr 29, 2023 · Updated 2 years ago
- ⭐23 · Jul 7, 2023 · Updated 2 years ago
- AloePlayer: a cross-platform local media player. · ⭐17 · Jan 24, 2026 · Updated last month
- The open source implementation of the multi grouped query attention by the paper "GQA: Training Generalized Multi-Query Transformer Model… · ⭐15 · Dec 11, 2023 · Updated 2 years ago
- An opinionated NLP research template · ⭐10 · Aug 29, 2024 · Updated last year
- ⭐24 · Jun 7, 2021 · Updated 4 years ago
- LLM KV cache compression made easy · ⭐971 · Mar 13, 2026 · Updated last week
- ⭐20 · Nov 21, 2025 · Updated 4 months ago
- [ICLR 2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation · ⭐251 · Dec 16, 2024 · Updated last year
- Fast and memory-efficient exact attention · ⭐20 · Mar 13, 2026 · Updated last week
- KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches. EMNLP Findings 2024 · ⭐89 · Feb 27, 2025 · Updated last year
- Implementation of 'A Convolutional Attention Network for Extreme Summarization of Source Code' · ⭐15 · Mar 14, 2019 · Updated 7 years ago
- ⭐13 · Jul 3, 2024 · Updated last year
- Code base for the EMNLP 2021 paper, "Multi-granularity Textual Adversarial Attack with Behavior Cloning". · ⭐13 · Apr 18, 2022 · Updated 3 years ago
- Approximate convex decomposition (ACD) · ⭐10 · Sep 9, 2023 · Updated 2 years ago
- ⭐16 · Jun 14, 2024 · Updated last year