ArmenJeddi / saint
A training-free approach to accelerating ViTs and VLMs by pruning redundant tokens based on similarity
☆37 · Updated 4 months ago
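The core idea described above is to identify visual tokens that are near-duplicates of other tokens and drop them before the expensive attention layers. Below is a minimal, illustrative sketch of similarity-based token pruning in PyTorch. It is not the saint implementation: the helper name `prune_redundant_tokens`, the tensor shapes, and the scoring rule (ranking tokens by their nearest-neighbour cosine similarity) are assumptions chosen only to show the general technique.

```python
# Illustrative sketch only (not the saint repo's code): prune the tokens that are
# most similar to some other token, i.e. the most redundant ones.
import torch

def prune_redundant_tokens(tokens: torch.Tensor, num_prune: int) -> torch.Tensor:
    """tokens: (batch, n, dim) visual tokens; returns (batch, n - num_prune, dim)."""
    normed = torch.nn.functional.normalize(tokens, dim=-1)
    sim = normed @ normed.transpose(-2, -1)               # (batch, n, n) pairwise cosine similarity
    sim.diagonal(dim1=-2, dim2=-1).fill_(float("-inf"))   # ignore self-similarity
    redundancy = sim.max(dim=-1).values                   # similarity to each token's nearest neighbour
    num_keep = tokens.shape[1] - num_prune
    keep = redundancy.topk(num_keep, dim=-1, largest=False).indices  # least redundant tokens
    keep, _ = keep.sort(dim=-1)                           # preserve the original token order
    index = keep.unsqueeze(-1).expand(-1, -1, tokens.shape[-1])
    return torch.gather(tokens, 1, index)

# Example: keep 192 of 256 visual tokens per image
x = torch.randn(2, 256, 768)
print(prune_redundant_tokens(x, num_prune=64).shape)      # torch.Size([2, 192, 768])
```

Many of the repositories listed below follow the same pattern but differ in how they score redundancy (e.g. [CLS] attention, text-visual attention, diversity, or clustering) and where in the pipeline the pruning happens.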
Alternatives and similar repositories for saint
Users who are interested in saint are comparing it to the libraries listed below.
- Official code for paper: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster. ☆92 · Updated 3 months ago
- ☆27 · Updated 7 months ago
- [ICML'25] Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference". ☆163 · Updated 4 months ago
- [AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio… ☆41 · Updated 5 months ago
- [ICCV 2025] Official code for paper: Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs ☆37 · Updated 3 months ago
- The official implementation for MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24) ☆66 · Updated 3 months ago
- Official repository of InLine attention (NeurIPS 2024) ☆56 · Updated 9 months ago
- [CVPR 2025] PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models ☆44 · Updated last week
- CLIP-MoE: Mixture of Experts for CLIP ☆47 · Updated last year
- [2025] Efficient Vision Language Models: A Survey ☆32 · Updated 2 months ago
- ☆58 · Updated 5 months ago
- [ICML 2025] Official code of From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection ☆23 · Updated 3 months ago
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25) ☆10 · Updated 3 months ago
- [ICLR 2025] Mixture Compressor for Mixture-of-Experts LLMs Gains More ☆56 · Updated 8 months ago
- Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model ☆35 · Updated 9 months ago
- [NeurIPS'24] Efficient and accurate memory-saving method towards W4A4 large multi-modal models. ☆83 · Updated 9 months ago
- [ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers. ☆34 · Updated 9 months ago
- Code release for VTW (AAAI 2025 Oral) ☆50 · Updated 2 months ago
- Adapting LLaMA Decoder to Vision Transformer ☆30 · Updated last year
- [NeurIPS 24] MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks ☆130 · Updated 10 months ago
- 🚀 Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models ☆35 · Updated last month
- [CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models ☆47 · Updated 4 months ago
- [EMNLP 2025 main] Code for "Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More" ☆74 · Updated last month
- [NeurIPS 2024] Official implementation of "Visual Fourier Prompt Tuning" ☆33 · Updated 8 months ago
- Data distillation benchmark ☆68 · Updated 3 months ago
- Dimple, the first Discrete Diffusion Multimodal Large Language Model ☆101 · Updated 3 months ago
- [CVPR 2025] Breaking the Low-Rank Dilemma of Linear Attention ☆29 · Updated 7 months ago
- [ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference ☆45 · Updated last year
- LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models ☆149 · Updated 2 weeks ago
- [NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient ☆53 · Updated 2 weeks ago