Official implementation of "TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization" (Findings of ACL 2025).
☆21 Jul 25, 2025 Updated 10 months ago
Alternatives and similar repositories for TailorKV
Users who are interested in TailorKV are comparing it to the libraries listed below.
- ☆17 Apr 17, 2025 Updated last year
- Residual vector quantization for KV cache compression in large language model ☆12 Oct 22, 2024 Updated last year
- [ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning ☆13 Sep 2, 2024 Updated last year
- [SIGMOD 2025] PQCache: Product Quantization-based KVCache for Long Context LLM Inference ☆91 Dec 7, 2025 Updated 6 months ago
- This repository presents the source code for the paper "MILLION: Mastering Long-Context LLM Inference Via Outlier-Immunized KV Product Qu… ☆25 Apr 2, 2025 Updated last year
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference ☆59 Nov 20, 2024 Updated last year
- ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24) ☆53 Dec 17, 2024 Updated last year
- Benchmarking Multi-Step Spatial Reasoning in MLLMs with LEGO-based VQA & generation tasks. ☆37 Jun 20, 2025 Updated 11 months ago
- ☆11 Apr 4, 2022 Updated 4 years ago
- InstAttention: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference ☆17 Mar 30, 2025 Updated last year
- [CVPR 2025] LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant ☆30 Dec 2, 2025 Updated 6 months ago
- Code for Federated Generalized Bayesian Learning via Distributed Stein Variational Gradient Descent ☆10 Nov 19, 2020 Updated 5 years ago
- Implementation for Phenotype prediction from single-cell RNA-seq data using attention-based neural networks (Bioinformatics 2024). ☆13 Jul 15, 2024 Updated last year
- ☆53 May 13, 2024 Updated 2 years ago
- ☆11 Oct 31, 2021 Updated 4 years ago
- Source code for ComNet paper: Satellite multi-beam multicast support for an efficient community-based CDN ☆10 Jul 26, 2022 Updated 3 years ago
- Ever wondered how popular your GitHub repo is compared to others? ☆19 Feb 14, 2026 Updated 4 months ago
- [ICLR 2025] SmartPretrain: Model-Agnostic and Dataset-Agnostic Representation Learning for Motion Prediction ☆41 Mar 24, 2025 Updated last year
- ☆11 Sep 6, 2024 Updated last year
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering ☆18 Oct 31, 2024 Updated last year
- Matlab interface for SCS ☆14 May 11, 2026 Updated last month
- ☆12 Jul 16, 2020 Updated 5 years ago
- ☆10 Dec 13, 2022 Updated 3 years ago
- ☆20 Jun 17, 2024 Updated last year
- [CVPR 2026] An official implementation of Adv-GRPO. The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image… ☆84 Feb 26, 2026 Updated 3 months ago
- Matlab scripts for the paper "Machine Learning meets Stochastic Geometry: Determinantal Subset Selection for Wireless Networks" ☆12 May 4, 2019 Updated 7 years ago
- ☆320 Jul 10, 2025 Updated 11 months ago
- ☆11 Dec 4, 2025 Updated 6 months ago
- LLM-Based Multi-Agent Situation Awareness ☆18 Jun 4, 2026 Updated last week
- This is a code package is related to the follow scientific article: Andrea Pizzo, Daniel Verenzuela, Luca Sanguinetti and Emil Björnson,… ☆14 May 18, 2018 Updated 8 years ago
- ☆18 Apr 15, 2025 Updated last year
- This is the CUDA GPU implementation + Python interface (using PyTorch) of DCI. The paper can be found at https://arxiv.org/abs/1512.00442… ☆13 Dec 20, 2023 Updated 2 years ago
- Simulation code for the paper "FedSL: Federated Split Learning for Collaborative Healthcare Analytics on Resource-Constrained Wearable Io… ☆17 Feb 2, 2024 Updated 2 years ago
- Simulation codes for over-the-air federated learning via second-order optimization ☆14 Jan 27, 2022 Updated 4 years ago
- Algorithms re-implementation for paper "Power Allocation in Cache-Aided NOMA Systems: Optimization and Deep Reinforcement Learning Approa… ☆12 Jan 7, 2023 Updated 3 years ago
- Artifacts Release: A Case for Stateless Mobile Core Network Functions in Space ☆16 Aug 16, 2022 Updated 3 years ago
- This repository is about the paper and the code of "Beam grouping based user scheduling in multi-cell millimeter-wave MIMO systems". ☆12 Sep 23, 2021 Updated 4 years ago
- Compression-based decentralized stochastic gradient descent (DSGD) algorithms tailored for digital and analog wireless implementations ☆13 Jun 26, 2022 Updated 3 years ago
- Matlab code associated with the publication "Load Modulation for Backscatter Communication: Channel Capacity and Capacity-Approaching Fin… ☆14 Nov 15, 2022 Updated 3 years ago