MLGroupJLU / RWKV-Survey
The official GitHub page for the survey paper "A Survey of RWKV".
☆25 · Updated 2 months ago
Alternatives and similar repositories for RWKV-Survey:
Users interested in RWKV-Survey are comparing it to the repositories listed below.
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025) ☆25 · Updated 8 months ago
- HGRN2: Gated Linear RNNs with State Expansion ☆53 · Updated 7 months ago
- Triton implementation of bi-directional (non-causal) linear attention ☆44 · Updated last month
- ☆16 · Updated 2 years ago
- Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts) ☆19 · Updated 7 months ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆30 · Updated 9 months ago
- [ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang ☆14 · Updated last year
- Mixture of Attention Heads ☆43 · Updated 2 years ago
- A repository for DenseSSMs ☆87 · Updated 11 months ago
- [ICLR 2025] Official code release for Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation ☆41 · Updated last month
- PyTorch implementation of "From Sparse to Soft Mixtures of Experts" ☆53 · Updated last year
- ☆54 · Updated last month
- Official implementation of MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24) ☆44 · Updated 3 weeks ago
- Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024 Oral) ☆21 · Updated 2 months ago
- Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML '24) ☆29 · Updated 7 months ago
- [ICML 2024 Oral] Official implementation of Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti… ☆63 · Updated 11 months ago
- Official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆37 · Updated 5 months ago
- ☆23 · Updated 6 months ago
- Scaling Sparse Fine-Tuning to Large Language Models ☆16 · Updated last year
- [ICLR 2024] Official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆26 · Updated last year
- ☆14 · Updated last year
- Implementation of "Hedgehog" from the paper "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry" ☆13 · Updated last year
- ☆20 · Updated last year
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ☆48 · Updated 2 years ago
- ☆21 · Updated 2 years ago
- Here we will test various linear attention designs. ☆60 · Updated 11 months ago
- Official implementation of the paper "A deeper look at depth pruning of LLMs" ☆14 · Updated 8 months ago
- Implementation of Griffin from the paper "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆52 · Updated 2 months ago
- ☆48 · Updated last year
- Official implementation of the ICML 2024 paper RoSA (Robust Adaptation) ☆39 · Updated last year