SprocketLab / sparse_matrix_fine_tuningLinks

Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters"

☆22

Alternatives and similar repositories for sparse_matrix_fine_tuning

Users that are interested in sparse_matrix_fine_tuning are comparing it to the libraries listed below

Sorting:

IST-DASLab / RoSA
Official implementation of the ICML 2024 paper RoSA (Robust Adaptation)
☆44Updated last year
smonsays / hypernetwork-attention
Official code for the paper "Attention as a Hypernetwork"
☆46Updated last year
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆35Updated last year
YuchuanTian / DiJiang
[ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear at…
☆104Updated last year
fla-org / flash-bidirectional-linear-attention
Triton implement of bi-directional (non-causal) linear attention
☆56Updated 10 months ago
OpenNLPLab / LASP
Linear Attention Sequence Parallelism (LASP)
☆87Updated last year
kyegomez / Blockwise-Parallel-Transformer
32 times longer context window than vanilla Transformers and up to 4 times longer than memory efficient Transformers.
☆49Updated 2 years ago
mit-han-lab / VisCompare
A WebUI for Side-by-Side Comparison of Media (Images/Videos) Across Multiple Folders
☆24Updated 9 months ago
TRI-ML / linear_open_lm
A repository for research on medium sized language models.
☆78Updated last year
Infini-AI-Lab / gsm_infinite
☆55Updated 5 months ago
fabienfrfr / tptt
😊 TPTT: Transforming Pretrained Transformers into Titans
☆37Updated 2 weeks ago
ylsung / rsq
Code for "RSQ: Learning from Important Tokens Leads to Better Quantized LLMs"
☆20Updated 5 months ago
Infini-AI-Lab / S2FT
☆19Updated 11 months ago
BlinkDL / LinearAttentionArena
Here we will test various linear attention designs.
☆62Updated last year
Doraemonzzz / xmixers
Xmixers: A collection of SOTA efficient token/channel mixers
☆29Updated 3 months ago
LiqunMa / FBI-LLM
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
☆51Updated 3 months ago
TianjinYellow / SPAM-Optimizer
☆36Updated 8 months ago
IST-DASLab / QuEST
Work in progress.
☆75Updated 2 weeks ago
The-Inscrutable-X / TACQ
Official Repository for Task-Circuit Quantization
☆24Updated 6 months ago
RobertCsordas / moe_attention
Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
☆102Updated last year
IST-DASLab / SparseFinetuning
Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry
☆42Updated last year
rayleizhu / vllm-ra
[ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts
☆40Updated last year
facebookresearch / MobileLLM-R1
MobileLLM-R1
☆68Updated 2 months ago
MarkXCloud / CSpD
The official repo of continuous speculative decoding
☆30Updated 8 months ago
yuzhenmao / IceFormer
Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024).
☆25Updated 4 months ago
sail-sg / SkyLadder
The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling
☆40Updated last month
pprp / Pruner-Zero
[ICML24] Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs
☆96Updated last year
Qichuzyy / POA
Official implementation of ECCV24 paper: POA
☆24Updated last year
tridao / flash-attention-wheels
☆58Updated 2 years ago
OpenNLPLab / HGRN2
HGRN2: Gated Linear RNNs with State Expansion
☆55Updated last year