chenyaofo / CCA-Attention
☆13 · Updated last week
Alternatives and similar repositories for CCA-Attention
Users interested in CCA-Attention are comparing it to the repositories listed below.
- ☆46 · Updated 2 months ago
- The official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆39 · Updated 10 months ago
- The official GitHub page for the survey paper "A Survey of RWKV". ☆28 · Updated 7 months ago
- ☆23 · Updated 3 weeks ago
- Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024 Oral) ☆26 · Updated 7 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆51 · Updated 2 months ago
- A repository for DenseSSMs ☆88 · Updated last year
- TreeRL: LLM Reinforcement Learning with On-Policy Tree Search (ACL 2025) ☆50 · Updated 2 months ago
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025) ☆28 · Updated 4 months ago
- The official implementation for MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24) ☆60 · Updated last month
- ☆15 · Updated 9 months ago
- A training-free approach to accelerate ViTs and VLMs by pruning redundant tokens based on similarity ☆30 · Updated 3 months ago
- This repository contains the code for our ICML 2025 paper "LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection" 🎉 ☆25 · Updated 2 months ago
- Code for the EMNLP 2024 paper "A simple and effective L2 norm based method for KV Cache compression." ☆16 · Updated 8 months ago
- Flash-Linear-Attention models beyond language ☆16 · Updated last month
- ☆24 · Updated 3 months ago
- ☆147 · Updated 11 months ago
- ☆22 · Updated last year
- Inference code for the paper "Harder Tasks Need More Experts: Dynamic Routing in MoE Models" ☆60 · Updated last year
- This repo contains the source code for "VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks" (NeurIPS 2024). ☆40 · Updated 10 months ago
- ☆26 · Updated last year
- Implementation of Griffin from the paper "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆56 · Updated last week
- Code for Merging Large Language Models ☆33 · Updated last year
- ☆72 · Updated 6 months ago
- ☆98 · Updated 4 months ago
- ☆20 · Updated 2 months ago
- ☆55 · Updated last month
- [ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models ☆124 · Updated last month
- State Space Models ☆70 · Updated last year
- The code for "AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference", Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Ch… ☆17 · Updated last month