MLGroupJLU / RWKV-Survey
The official GitHub page for the survey paper "A Survey of RWKV".
☆25 · Updated 3 months ago
Alternatives and similar repositories for RWKV-Survey:
Users interested in RWKV-Survey are comparing it to the repositories listed below.
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆30 · Updated 10 months ago
- A repository for DenseSSMs ☆87 · Updated last year
- Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts) ☆20 · Updated 8 months ago
- ☆16 · Updated 2 years ago
- Triton implementation of bidirectional (non-causal) linear attention ☆46 · Updated 2 months ago
- HGRN2: Gated Linear RNNs with State Expansion ☆54 · Updated 8 months ago
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025) ☆26 · Updated 2 weeks ago
- [ICML 2024 Oral] Official implementation of "Accurate LoRA-Finetuning Quantization of LLMs via Information Retention" ☆65 · Updated last year
- PyTorch implementation of the paper "Learning to (Learn at Test Time): RNNs with Expressive Hidden States" ☆24 · Updated this week
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆51 · Updated 2 months ago
- Official implementation of the ICML 2024 paper RoSA (Robust Adaptation) ☆40 · Updated last year
- Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML'24) ☆29 · Updated 8 months ago
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆31 · Updated last year
- Mixture of Attention Heads ☆44 · Updated 2 years ago
- [ICLR 2024] Official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models" ☆27 · Updated last year
- [ICLR 2025] Official code release for "Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation" ☆42 · Updated last month
- Papers for a comprehensive survey on accelerated generation techniques in Large Language Models (LLMs) ☆11 · Updated 11 months ago
- ☆48 · Updated last year
- [Preprint] "Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Pruning" ☆40 · Updated 2 years ago
- ☆14 · Updated last year
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models ☆29 · Updated 6 months ago
- Implementation of the paper "Training-Free Pretrained Model Merging" (CVPR 2024) ☆29 · Updated last year
- [CVPR 2025] Breaking the Low-Rank Dilemma of Linear Attention ☆16 · Updated last month
- State Space Models ☆69 · Updated 11 months ago
- PyTorch implementation of StableMask (ICML'24) ☆12 · Updated 9 months ago
- ☆57 · Updated 2 months ago
- Official implementation of the paper "A Deeper Look at Depth Pruning of LLMs" ☆15 · Updated 9 months ago
- BESA, a differentiable weight-pruning technique for large language models ☆16 · Updated last year
- Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024 Oral) ☆22 · Updated 3 months ago