[ICLR2025] Code and data for paper: Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning
☆40 · Mar 10, 2025 · Updated 11 months ago
Alternatives and similar repositories for HeadKV
Users that are interested in HeadKV are comparing it to the libraries listed below
- The Official Implementation of Ada-KV [NeurIPS 2025] · ☆128 · Nov 26, 2025 · Updated 3 months ago
- ☆35 · Mar 17, 2025 · Updated 11 months ago
- Efficient retrieval head analysis with triton flash attention that supports topK probability · ☆13 · Jun 15, 2024 · Updated last year
- ☆20 · Jun 17, 2024 · Updated last year
- The official Github Repo and Download for the FNAF Mod · ☆10 · Nov 10, 2015 · Updated 10 years ago
- CUDA, CuDNN, NVIDIA Driver, and PyTorch Installation for Ubuntu · ☆12 · Feb 27, 2025 · Updated last year
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context" · ☆36 · Jul 11, 2024 · Updated last year
- Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference · ☆161 · Oct 13, 2025 · Updated 4 months ago
- open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality · ☆231 · Aug 2, 2024 · Updated last year
- A Forge based Minecraft server-side plugin API · ☆13 · Nov 23, 2014 · Updated 11 years ago
- [ICLR 2025] Palu: Compressing KV-Cache with Low-Rank Projection · ☆154 · Feb 20, 2025 · Updated last year
- Library of common cryptographic algorithms and functions for Pony · ☆12 · Jul 16, 2025 · Updated 7 months ago
- ☆11 · May 24, 2024 · Updated last year
- AllTheModium for Minecraft 1.16+ · ☆17 · Jan 16, 2026 · Updated last month
- translation of pi3d from python to rust · ☆11 · Jun 28, 2025 · Updated 8 months ago
- A two-dimensional esoteric programming language, inspired by Hexagony and based on Surface · ☆11 · Nov 14, 2019 · Updated 6 years ago
- A Cydia Repo for iOS tweaks hosted on github. · ☆13 · Mar 5, 2018 · Updated 7 years ago
- Userland and toolchain for seakernel · ☆13 · Dec 11, 2015 · Updated 10 years ago
- Easy, flexible C unit testing · ☆11 · Feb 13, 2016 · Updated 10 years ago
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference · ☆57 · Nov 20, 2024 · Updated last year
- Simple, straightforward MSP430 disassembler and assembler in Python · ☆14 · Jul 5, 2023 · Updated 2 years ago
- AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence · ☆10 · Mar 2, 2025 · Updated last year
- [NeurIPS'25 Spotlight] Adaptive Attention Sparsity with Hierarchical Top-p Pruning · ☆87 · Nov 29, 2025 · Updated 3 months ago
- ☆226 · Nov 19, 2025 · Updated 3 months ago
- The historical, initial implementation of an ooc compiler in Java · ☆115 · Feb 6, 2013 · Updated 13 years ago
- ☆14 · Dec 25, 2024 · Updated last year
- ooc operating system · ☆41 · May 24, 2021 · Updated 4 years ago
- [satire] A complete language backed by C++ · ☆22 · May 1, 2015 · Updated 10 years ago
- Official Implementation (Pytorch) of the "Generative Subgraph Retrieval for Knowledge Graph-Grounded Dialog Generation", EMNLP 2024 (main… · ☆12 · Mar 10, 2025 · Updated 11 months ago
- Repository used for my master's thesis on implementing RVSDG as a dialect of MLIR · ☆13 · May 30, 2023 · Updated 2 years ago
- Omgrofl interpreter · ☆16 · Oct 1, 2020 · Updated 5 years ago
- yet another Minecraft clone · ☆17 · Oct 24, 2011 · Updated 14 years ago
- Call Julia from Rust · ☆16 · Dec 8, 2016 · Updated 9 years ago
- Moved to codeberg.org/derat/nitter-rss-proxy · ☆11 · Apr 18, 2023 · Updated 2 years ago
- An awesome list of reusable chart modules created by the Reuters graphics team · ☆10 · Jan 20, 2022 · Updated 4 years ago
- ☆11 · Jul 2, 2024 · Updated last year
- A rust port of https://github.com/charmbracelet/lipgloss · ☆20 · Dec 15, 2025 · Updated 2 months ago
- Experimental Java agent that optimizes ModLauncher/FML/LaunchWrapper · ☆11 · Sep 20, 2025 · Updated 5 months ago
- An operating system. · ☆30 · Dec 6, 2017 · Updated 8 years ago