☆36 · Mar 17, 2025 · Updated 11 months ago
Alternatives and similar repositories for cakekv
Users that are interested in cakekv are comparing it to the libraries listed below
- Must-read papers on improving efficiency for LLM serving clusters ☆33 · May 28, 2025 · Updated 9 months ago
- [ICLR2025] Code and data for paper: Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasonin… ☆40 · Mar 10, 2025 · Updated 11 months ago
- ☆302 · Jul 10, 2025 · Updated 7 months ago
- ☆16 · Apr 15, 2025 · Updated 10 months ago
- (ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation ☆34 · May 28, 2025 · Updated 9 months ago
- The Official Implementation of Ada-KV [NeurIPS 2025] ☆128 · Nov 26, 2025 · Updated 3 months ago
- [ICLR 2025🔥] D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models ☆27 · Jul 7, 2025 · Updated 8 months ago
- official code for GliDe with a CaPE ☆20 · Aug 13, 2024 · Updated last year
- [ICLR 2025] The official PyTorch implementation of "Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Cont… ☆72 · Sep 18, 2025 · Updated 5 months ago
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗). ☆664 · Feb 24, 2026 · Updated last week
- ABSTRACT: In this paper, a two-stage grid-connected photovoltaic system is presented, which consists of an inverter and a dc-dc converter (Boost con… ☆11 · Sep 15, 2021 · Updated 4 years ago
- Repository of IPBench ☆19 · Jan 4, 2026 · Updated 2 months ago
- Official Implementation for [ICLR26] DefensiveKV: Taming the Fragility of KV Cache Eviction in LLM Inference ☆22 · Feb 9, 2026 · Updated last month
- AI Hedge Fund repo integrated with DeepSeek V3 and R1 hosted on SiliconFlow. ☆12 · Feb 3, 2025 · Updated last year
- Hardware implementation of a Fixed Point Recursive Forward and Inverse FFT algorithm ☆16 · Mar 3, 2018 · Updated 8 years ago
- This project aims at predicting correlated column pairs in data tables by analyzing column names via large language models. ☆11 · Aug 21, 2023 · Updated 2 years ago
- SystemVerilog examples for a digital design course ☆13 · Mar 30, 2021 · Updated 4 years ago
- ☆17 · Apr 15, 2025 · Updated 10 months ago
- AZ Processor backend for LLVM ☆11 · Oct 29, 2013 · Updated 12 years ago
- Official Implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024) ☆11 · Jun 20, 2025 · Updated 8 months ago
- ☆10 · Aug 20, 2023 · Updated 2 years ago
- ☆13 · Jan 7, 2025 · Updated last year
- [NeurIPS 2025] This is the official repository for "RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis" ☆26 · Nov 21, 2025 · Updated 3 months ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention ☆15 · Jul 16, 2025 · Updated 7 months ago
- ☆47 · Nov 25, 2024 · Updated last year
- ☆17 · Jun 19, 2021 · Updated 4 years ago
- An official repository for GPTailor ☆17 · Jun 29, 2025 · Updated 8 months ago
- Linux kernel technical documentation ☆16 · Feb 26, 2026 · Updated last week
- ☆11 · Jun 10, 2022 · Updated 3 years ago
- ☆12 · Feb 28, 2025 · Updated last year
- original 8bit CPU of ICF3-Z ☆12 · Feb 20, 2020 · Updated 6 years ago
- Human Resource Management App ☆15 · Aug 5, 2016 · Updated 9 years ago
- ☆15 · Apr 11, 2024 · Updated last year
- a simple new ISA nnISA and nnSOC nnCPU nnAs nnCc ☆10 · Mar 15, 2020 · Updated 5 years ago
- ☆12 · Feb 15, 2023 · Updated 3 years ago
- (ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct" ☆20 · May 15, 2025 · Updated 9 months ago
- The official implementation of "Test-time Adaptation for Regression by Subspace Alignment" (ICLR 2025). ☆15 · Jun 6, 2025 · Updated 9 months ago
- ☆20 · Apr 18, 2024 · Updated last year
- [ICML 2024] VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling ☆10 · Sep 22, 2024 · Updated last year