chenyaofo / CCA-Attention
☆13 · Updated last month
Alternatives and similar repositories for CCA-Attention
Users interested in CCA-Attention are comparing it to the libraries listed below.
- ☆48 · Updated 2 months ago
- The official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆39 · Updated 11 months ago
- The official implementation of MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24) ☆62 · Updated 2 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆52 · Updated 3 months ago
- A training-free approach to accelerate ViTs and VLMs by pruning redundant tokens based on similarity ☆35 · Updated 3 months ago
- The official GitHub page for the survey paper "A Survey of RWKV" ☆29 · Updated 8 months ago
- ☆25 · Updated last month
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025) ☆31 · Updated 5 months ago
- ☆148 · Updated last year
- Codes for Merging Large Language Models ☆33 · Updated last year
- ☆18 · Updated last month
- The code for "AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference", Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Ch… ☆16 · Updated 2 months ago
- ☆25 · Updated 3 months ago
- This repository contains the code for our ICML 2025 paper, "LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection" 🎉 ☆25 · Updated 3 months ago
- Code for the EMNLP 2024 paper "A Simple and Effective L2 Norm-Based Method for KV Cache Compression" ☆16 · Updated 9 months ago
- ☆13 · Updated 7 months ago
- [ICML '25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen… ☆79 · Updated 2 months ago
- This repo contains the source code for VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks (NeurIPS 2024) ☆40 · Updated 11 months ago
- ☆24 · Updated 4 months ago
- TreeRL: LLM Reinforcement Learning with On-Policy Tree Search (ACL 2025) ☆60 · Updated 2 months ago
- ICLR 2025 ☆28 · Updated 3 months ago
- PyTorch implementation of StableMask (ICML '24) ☆14 · Updated last year
- ☆57 · Updated 2 months ago
- [EMNLP 2023, Main Conference] Sparse Low-rank Adaptation of Pre-trained Language Models ☆82 · Updated last year
- MokA: Multimodal Low-Rank Adaptation for MLLMs ☆22 · Updated 2 months ago
- Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024 Oral) ☆26 · Updated 7 months ago
- ☆15 · Updated 10 months ago
- Code for Heima ☆52 · Updated 4 months ago
- User-friendly implementation of the Mixture-of-Sparse-Attention (MoSA). MoSA selects distinct tokens for each head with expert choice rou… ☆26 · Updated 4 months ago
- ☆22 · Updated last year