DRSY/KV_Compression

[EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens

☆25

Alternatives and similar repositories for KV_Compression

Users who are interested in KV_Compression are comparing it to the repositories listed below.

DRSY / EasyKV
Easy control for Key-Value Constrained Generative LLM Inference (https://arxiv.org/abs/2402.06262)
☆62 · Feb 13, 2024 · Updated 2 years ago
DRSY / DGen
[AAAI 2021] Knowledge-Driven Distractor Generation for Cloze-Style Multiple Choice Questions
☆22 · Jul 29, 2021 · Updated 4 years ago
shoaibahmed / llm_depth_pruning
Official implementation of the paper: "A deeper look at depth pruning of LLMs"
☆15 · Jul 24, 2024 · Updated last year
snu-mllab / Context-Memory
PyTorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24)
☆64 · Apr 18, 2024 · Updated 2 years ago
LuLuLuyi / LongHeads
[EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor
☆32 · Apr 8, 2024 · Updated 2 years ago
Sunmmyy / OTPR
Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport
☆15 · Feb 26, 2025 · Updated last year
salesforce / simplification
☆23 · Jun 25, 2026 · Updated 3 weeks ago
whn09 / VITA
✨✨ VITA: Towards Open-Source Interactive Omni Multimodal LLM
☆11 · Jun 16, 2025 · Updated last year
hccngu / DialCoT
DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models
☆13 · Nov 2, 2023 · Updated 2 years ago
DRSY / EMO
[ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)
☆129 · Mar 7, 2024 · Updated 2 years ago
raymin0223 / fast_robust_early_exit
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)
☆67 · Sep 28, 2024 · Updated last year
lando22 / GPT-3T
Building language models that predict more than one token ahead.
☆12 · May 22, 2025 · Updated last year
eth-lre / LLM_ICL
ACL24
☆11 · Jun 7, 2024 · Updated 2 years ago
pppa2019 / swie_overmiss_llm4mt
Code for "Improving Translation Faithfulness of Large Language Models via Augmenting Instructions"
☆12 · Aug 26, 2023 · Updated 2 years ago
nchen909 / CodeAttention
CAT-probing: A Metric-based Approach to Interpret How Pre-trained Models for Programming Language Attend Code Structure, EMNLP 2022
☆13 · Dec 10, 2022 · Updated 3 years ago
xufangzhi / Genius
[ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework
☆72 · Jun 1, 2025 · Updated last year
hobinkwak / ExpectedGradients_IntegratedGradients_pytorch
Simple implementation of Expected Gradients and Integrated Gradients in PyTorch
☆12 · May 11, 2022 · Updated 4 years ago
HarlynDN / WebCiteS
[ACL'24] WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations
☆13 · Sep 11, 2024 · Updated last year
YiteWang / NTK-SAP
[ICLR 2023] NTK-SAP: Improving neural network pruning by aligning training dynamics
☆20 · May 1, 2023 · Updated 3 years ago
SqrtiZhang / openreview_ICRL2024_analysis
☆10 · Nov 28, 2023 · Updated 2 years ago
RTkenny / RiskPO
Official implementation of 'RiskPO: Risk-based Policy Optimization via Verifiable Reward for LLM Post-Training', accepted by ICLR 2026
☆18 · Oct 15, 2025 · Updated 9 months ago
OpenNLPLab / ETSC-Exact-Toeplitz-to-SSM-Conversion
[EMNLP 2023] Official implementation of the algorithm ETSC: Exact Toeplitz-to-SSM Conversion in our EMNLP 2023 paper - Accelerating Toeplitz…
☆14 · Oct 17, 2023 · Updated 2 years ago
d-matrix-ai / keyformer-llm
Keyformer proposes KV cache reduction through key-token identification, without the need for fine-tuning
☆57 · Mar 26, 2024 · Updated 2 years ago
ttwthomas / nanogpt
Fork of karpathy's nanoGPT with custom datasets
☆11 · Jul 25, 2023 · Updated 2 years ago
Winsleo / General-parallel-genetic-algorithm-on-GPU
CUDA-based, GPU-accelerated implementation of a general-purpose genetic algorithm; the experimental platform is the Nvidia Jetson Nano
☆13 · Mar 23, 2023 · Updated 3 years ago
BofangJia / SDM-Policy
Score and Distribution Matching Policy: Advanced accelerated Visuomotor Policies via matched distillation
☆11 · May 9, 2025 · Updated last year
Cranial-XIX / longhorn
Official PyTorch Implementation of the Longhorn Deep State Space Model
☆57 · Dec 4, 2024 · Updated last year
Adaxry / Post-Instruction
☆21 · Sep 5, 2023 · Updated 2 years ago
Zoeyyao27 / SirLLM
This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM
☆60 · May 28, 2024 · Updated 2 years ago
DeepSoftwareAnalytics / Telly
Replication package for ISSTA 2023 paper - Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond
☆23 · Apr 9, 2023 · Updated 3 years ago
yliu-cs / PiTe
[ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model
☆17 · Feb 13, 2025 · Updated last year
jayelm / gisting
Learning to Compress Prompts with Gist Tokens - https://arxiv.org/abs/2304.08467
☆322 · Feb 14, 2025 · Updated last year
WindyLee0822 / CTG
Source code of "Reinforcement Learning with Token-level Feedback for Controllable Text Generation" (NAACL 2024)
☆17 · Dec 8, 2024 · Updated last year
smonsays / hypernetwork-attention
Official code for the paper "Attention as a Hypernetwork"
☆58 · Feb 24, 2026 · Updated 4 months ago
AdelWang / KD-CoT
☆15 · Apr 22, 2024 · Updated 2 years ago
DRSY / MoTIS
[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)
☆126 · May 11, 2023 · Updated 3 years ago
OpenNLPLab / Transnormer
[EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper - The Devil in Linear Transformer
☆65 · Jul 30, 2023 · Updated 2 years ago
GATECH-EIC / SuperTickets
[ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning
☆20 · Jul 7, 2022 · Updated 4 years ago
princeton-nlp / AutoCompressors
[EMNLP 2023] Adapting Language Models to Compress Long Contexts
☆337 · Sep 9, 2024 · Updated last year